{
 "cells": [
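  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load the data\n",
    "\n",
    "Load the hotel review data and count the positive and negative reviews: 9,899 reviews in total (6,900 positive and 2,999 negative). **All data files are in the `ChnSentiCorp_htl_unba_10000` folder.**"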
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "积极评论 6900 条\n",
      "消极评论 2999 条\n",
      "<class 'pandas.core.frame.DataFrame'>\n",
      "Int64Index: 9899 entries, 0 to 2998\n",
      "Data columns (total 2 columns):\n",
      "comment    9899 non-null object\n",
      "pos_neg    9899 non-null int64\n",
      "dtypes: int64(1), object(1)\n",
      "memory usage: 232.0+ KB\n",
      "None\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>comment</th>\n",
       "      <th>pos_neg</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>距离川沙公路较近,但是公交指示不对,如果是\"蔡陆线\"的话,会非常麻烦.建议用别的路线.房间较...</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>商务大床房，房间很大，床有2M宽，整体感觉经济实惠不错!</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>早餐太差，无论去多少人，那边也不加食品的。酒店应该重视一下这个问题了。房间本身很好。</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>宾馆在小街道上，不大好找，但还好北京热心同胞很多~宾馆设施跟介绍的差不多，房间很小，确实挺小...</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>CBD中心,周围没什么店铺,说5星有点勉强.不知道为什么卫生间没有电吹风</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                             comment  pos_neg\n",
       "0  距离川沙公路较近,但是公交指示不对,如果是\"蔡陆线\"的话,会非常麻烦.建议用别的路线.房间较...        1\n",
       "1                       商务大床房，房间很大，床有2M宽，整体感觉经济实惠不错!        1\n",
       "2         早餐太差，无论去多少人，那边也不加食品的。酒店应该重视一下这个问题了。房间本身很好。        1\n",
       "3  宾馆在小街道上，不大好找，但还好北京热心同胞很多~宾馆设施跟介绍的差不多，房间很小，确实挺小...        1\n",
       "4               CBD中心,周围没什么店铺,说5星有点勉强.不知道为什么卫生间没有电吹风        1"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "\n",
    "import numpy as np\n",
    "import os\n",
    "from time import time\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "import pandas as pd\n",
    "import jieba\n",
    "\n",
    "from sklearn.datasets import fetch_20newsgroups  # 20 Newsgroups dataset loader\n",
    "from sklearn.feature_extraction.text import TfidfVectorizer  # TF-IDF encoding\n",
    "from sklearn.feature_selection import SelectKBest, chi2  # chi-square test for feature selection\n",
    "from sklearn.linear_model import RidgeClassifier\n",
    "from sklearn.svm import LinearSVC,SVC\n",
    "from sklearn.naive_bayes import MultinomialNB, BernoulliNB  # multinomial and Bernoulli naive Bayes\n",
    "from sklearn.neighbors import KNeighborsClassifier\n",
    "from sklearn.ensemble import RandomForestClassifier\n",
    "from sklearn.model_selection import GridSearchCV\n",
    "from sklearn.model_selection import train_test_split\n",
    "# SMOTE oversampling (disabled)\n",
    "# from imblearn.over_sampling import SMOTE\n",
    "from collections import Counter\n",
    "from sklearn import metrics\n",
    "plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can render Chinese labels in plots\n",
    "\n",
    "# Data directory (raw string, so the backslashes are not treated as escapes)\n",
    "os.chdir(r'D:\\codePractice\\data\\ChnSentiCorp_htl_unba_10000')\n",
    "\n",
    "# Read the positive reviews\n",
    "positive = pd.read_csv('./pos.csv')\n",
    "# Add the polarity label\n",
    "positive['pos_neg']=1\n",
    "# Count the positive reviews\n",
    "pos_len = len(positive)\n",
    "print(\"积极评论\",pos_len,\"条\")\n",
    "# positive.head()\n",
    "# positive.info()\n",
    "\n",
    "# Read the negative reviews\n",
    "negative = pd.read_csv('./neg.csv')\n",
    "# Add the polarity label\n",
    "negative['pos_neg']=-1\n",
    "# Count the negative reviews\n",
    "neg_len = len(negative)\n",
    "print(\"消极评论\",neg_len,\"条\")\n",
    "# negative.head()\n",
    "# negative.info()\n",
    "\n",
    "# Combine the reviews (each frame keeps its own index, so index values repeat)\n",
    "comments = pd.concat([positive,negative])\n",
    "\n",
    "print(comments.info())\n",
    "# 9,899 reviews\n",
    "comments.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data preprocessing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Splitting features and labels\n",
    "\n",
    "Split the data into the features `x` and the labels `y`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "总评论中的正负样本: Counter({1: 6900, -1: 2999})\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>comment</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>距离川沙公路较近,但是公交指示不对,如果是\"蔡陆线\"的话,会非常麻烦.建议用别的路线.房间较...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>商务大床房，房间很大，床有2M宽，整体感觉经济实惠不错!</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>早餐太差，无论去多少人，那边也不加食品的。酒店应该重视一下这个问题了。房间本身很好。</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>宾馆在小街道上，不大好找，但还好北京热心同胞很多~宾馆设施跟介绍的差不多，房间很小，确实挺小...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>CBD中心,周围没什么店铺,说5星有点勉强.不知道为什么卫生间没有电吹风</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                             comment\n",
       "0  距离川沙公路较近,但是公交指示不对,如果是\"蔡陆线\"的话,会非常麻烦.建议用别的路线.房间较...\n",
       "1                       商务大床房，房间很大，床有2M宽，整体感觉经济实惠不错!\n",
       "2         早餐太差，无论去多少人，那边也不加食品的。酒店应该重视一下这个问题了。房间本身很好。\n",
       "3  宾馆在小街道上，不大好找，但还好北京热心同胞很多~宾馆设施跟介绍的差不多，房间很小，确实挺小...\n",
       "4               CBD中心,周围没什么店铺,说5星有点勉强.不知道为什么卫生间没有电吹风"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# The classes are imbalanced\n",
    "x = pd.DataFrame(comments.comment)\n",
    "y = comments.pos_neg\n",
    "# Class distribution before any resampling\n",
    "print(\"总评论中的正负样本:\",Counter(y))\n",
    "# Counter({1: 6900, -1: 2999})\n",
    "\n",
    "# # Define the SMOTE model; random_state acts as the random seed\n",
    "# smo = SMOTE(random_state=42)\n",
    "# X_smo, y_smo = smo.fit_sample(x, y)\n",
    "\n",
    "# # Class distribution after SMOTE\n",
    "# print(Counter(y_smo))\n",
    "\n",
    "# # Train/test split\n",
    "# x_train,x_test,y_train,y_test = train_test_split(x,y,\n",
    "#                                                  test_size=0.2,\n",
    "#                                                  random_state=0)\n",
    "\n",
    "# # Training set\n",
    "# x_train[:5]\n",
    "# print(\"训练集中的正负样本：\",Counter(y_train))\n",
    "\n",
    "# # Test set\n",
    "# x_test[:5]\n",
    "# print(\"测试集中的正负样本：\",Counter(y_test))\n",
    "\n",
    "x.head()"
   ]
  },
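  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because roughly 70% of the reviews are positive, a model that always predicts the majority label already scores well on accuracy. The sketch below is an illustrative addition, not part of the original pipeline: the helper `majority_baseline` is a hypothetical name, and the counts are copied from the `Counter` output above. It gives a reference point for reading any accuracy figure on this dataset.\n",
    "\n",
    "```python\n",
    "from collections import Counter\n",
    "\n",
    "def majority_baseline(label_counts):\n",
    "    \"\"\"Return the most frequent label and the accuracy of always predicting it.\"\"\"\n",
    "    total = sum(label_counts.values())\n",
    "    label, count = max(label_counts.items(), key=lambda kv: kv[1])\n",
    "    return label, count / total\n",
    "\n",
    "# Counts copied from the Counter output above (hypothetical variable name)\n",
    "full_counts = Counter({1: 6900, -1: 2999})\n",
    "label, acc = majority_baseline(full_counts)\n",
    "print(f'majority label: {label}, baseline accuracy: {acc:.2%}')\n",
    "```\n",
    "\n",
    "A classifier whose test accuracy sits close to this value has learned little beyond the class prior."
   ]
  },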
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Converting documents to vectors"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Loading the stop words\n",
    "\n",
    "Load the stop-word list: 1,207 stop words in total."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "一共含有1207个停用词\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "['如下', '汝', '三番两次', '三番五次', '三天两头', '瑟瑟', '沙沙', '上', '上来', '上去']"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Read the stop-word list, one word per line\n",
    "def get_stopwords(stop_words_file):\n",
    "    with open(stop_words_file,'r',encoding='utf-8') as f:\n",
    "        stopwords = f.read()\n",
    "    return stopwords.split('\\n')\n",
    "\n",
    "\n",
    "# Chinese Academy of Sciences stop-word list\n",
    "stop_words_file =\"./stopWord.txt\"\n",
    "# Load the stop words, skipping the first line of the file\n",
    "stopwords = get_stopwords(stop_words_file)[1:]\n",
    "# Count the stop words\n",
    "print(\"一共含有%d个停用词\" % len(stopwords))\n",
    "# Show the last 10 stop words\n",
    "stopwords[-10:]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Word segmentation\n",
    "\n",
    "Segment the text with jieba, removing English letters and digits."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [],
   "source": [
    "import re\n",
    "\n",
    "# Clean a line of text before segmentation\n",
    "def clean_text(line):\n",
    "    if line!=' ':\n",
    "        line = line.strip()\n",
    "        # Remove English letters and digits\n",
    "        line = re.sub(\"[a-zA-Z0-9]\",\"\",line)\n",
    "        # Remove Chinese and English punctuation\n",
    "        line = re.sub(\"[\\s+\\.\\,\\!\\/_,$%^*(+\\\"\\'；：“”．]+|[+——！，。？?、~@#￥%……&*（）]+\", \"\", line)\n",
    "        return line\n",
    "    else:\n",
    "        return \"Empty Line.\"\n",
    "    \n",
    "# Segment with jieba and drop stop words\n",
    "def chinese_word_cut(line):\n",
    "    line = clean_text(line)\n",
    "    segList = jieba.cut(line,cut_all=False)\n",
    "    segSentence = ''\n",
    "    for word in segList:\n",
    "        if word!='\\t' and (word not in stopwords):\n",
    "            segSentence += (word+\" \")\n",
    "    return segSentence.strip()"
   ]
  },
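  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see what the cleaning step does to a single review without installing jieba, here is a minimal standard-library sketch. It is an illustrative addition: `clean_review` is a hypothetical name, and it swaps in a different technique from `clean_text` above. Instead of listing punctuation marks one by one, it keeps only characters in the basic CJK Unified Ideographs range, so its output can differ from `clean_text` on symbols that the original pattern does not list.\n",
    "\n",
    "```python\n",
    "import re\n",
    "\n",
    "def clean_review(text):\n",
    "    \"\"\"Simplified cleaner: drop ASCII letters and digits, then keep only CJK ideographs.\"\"\"\n",
    "    text = text.strip()\n",
    "    # Same first step as clean_text: remove English letters and digits\n",
    "    text = re.sub(r'[a-zA-Z0-9]', '', text)\n",
    "    # Swapped-in technique: whitelist the basic CJK block instead of\n",
    "    # blacklisting individual punctuation marks\n",
    "    return re.sub(r'[^\\u4e00-\\u9fff]', '', text)\n",
    "\n",
    "# Second review from the data preview above\n",
    "sample = '商务大床房，房间很大，床有2M宽，整体感觉经济实惠不错!'\n",
    "print(clean_review(sample))\n",
    "```\n",
    "\n",
    "The whitelist is stricter than the original pattern: it also removes whitespace, emoji, and full-width digits, which may or may not be what a given task needs."
   ]
  },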
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(9899, 2)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "0               距离 川沙 公路 较近 公交 指示 蔡陆线 麻烦 建议 路线 房间 较为简单\n",
       "1                    商务 大床 房 房间 很大 床有 宽 整体 感觉 经济 实惠 不错\n",
       "2                   早餐 太 差 人 不加 食品 酒店 应该 重视 一下 问题 房间 好\n",
       "3    宾馆 小 街道 不大好 找 还好 北京 热心 同胞 宾馆 设施 介绍 房间 很小 确实 挺 ...\n",
       "4                     中心 周围 没什么 店铺 说星 有点 勉强 知道 卫生间 电吹风\n",
       "Name: cutted_comment, dtype: object"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Segment the reviews\n",
    "x['cutted_comment'] = x.comment.apply(chinese_word_cut)\n",
    "print(x.shape)\n",
    "# (9899, 2)\n",
    "\n",
    "# Segmentation result\n",
    "x.cutted_comment[:5]\n",
    "# Features can then be extracted with BOW, TF-IDF, or Word2Vec"
   ]
  },
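  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Vector representation of text\n",
    "\n",
    "Vectorize the text with a bag-of-words model.\n",
    "- Filter out stop words;\n",
    "- `max_df = 0.9`: a word that appears in more than 90% of the documents is not used as a feature;\n",
    "- `min_df = 3`: a word must appear in at least 3 documents to be used as a feature;"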
   ]
  },
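  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.feature_extraction.text import CountVectorizer\n",
    "# Filter out some of the features\n",
    "# Maximum document frequency (as a proportion of documents)\n",
    "max_df = 0.9\n",
    "# Minimum document frequency (as a document count)\n",
    "min_df = 3\n",
    "# Token pattern: at least two word characters, not starting with a digit\n",
    "token_pattern = u'(?u)\\\\b[^\\\\d\\\\W]\\\\w+\\\\b'\n",
    "\n",
    "# Build the bag-of-words vectorizer\n",
    "vectorizer = CountVectorizer(max_df = max_df,\n",
    "               min_df = min_df,\n",
    "               token_pattern=token_pattern,\n",
    "               stop_words=frozenset(stopwords))"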
   ]
  },
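  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The document-frequency thresholds can be illustrated without scikit-learn. The sketch below is an illustrative addition under stated assumptions: `filter_vocabulary` is a hypothetical helper, the four toy documents are made up, and the thresholds follow the convention described above (an integer `min_df` is a document count, a float `max_df` is a proportion of documents). It does not reproduce the token pattern or the stop-word filter.\n",
    "\n",
    "```python\n",
    "from collections import Counter\n",
    "\n",
    "def filter_vocabulary(docs, min_df=3, max_df=0.9):\n",
    "    \"\"\"Keep terms whose document frequency lies in [min_df, max_df * n_docs].\"\"\"\n",
    "    n_docs = len(docs)\n",
    "    doc_freq = Counter()\n",
    "    for doc in docs:\n",
    "        # Count each term once per document, however often it occurs\n",
    "        doc_freq.update(set(doc.split()))\n",
    "    max_count = max_df * n_docs\n",
    "    return sorted(t for t, c in doc_freq.items() if min_df <= c <= max_count)\n",
    "\n",
    "# Made-up, already segmented toy documents\n",
    "toy_docs = [\n",
    "    '房间 干净 服务 好',\n",
    "    '房间 很小 服务 差',\n",
    "    '房间 干净 早餐 好',\n",
    "    '房间 服务 热情',\n",
    "]\n",
    "print(filter_vocabulary(toy_docs, min_df=2, max_df=0.75))\n",
    "```\n",
    "\n",
    "With these toy settings, a term found in all four documents is dropped as too common, and terms found in only one document are dropped as too rare."
   ]
  },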
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Train/test split\n",
    "\n",
    "Split the dataset into a training set and a test set at a ratio of 8:2."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "训练集中的正负样本： Counter({1: 5516, -1: 2403})\n",
      "训练集样本个数:7919,特征个数:8347\n",
      "测试集中的正负样本： Counter({1: 1384, -1: 596})\n",
      "测试集样本个数:1980,特征个数:8347\n",
      "\n",
      "特征筛选前特征个数:8347\n",
      "词汇:['一一' '一下' '一下下' ... '龙卡' '龙头' '龙门石窟']\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "5279                交通 便利 环境 尚可 提供 早餐 比较 好 旧 一点 总体 说 令人满意\n",
       "2929                                                喜欢 下次\n",
       "5496    六号楼 龙岩 地区 目前 最好 宾馆 硬件 新 服务 感觉 不错 美中不足 暖气 不行 晚上...\n",
       "5253    白天 恐龙 园 游玩 过后 月 号 富 商贸 酒店 住 一晚 住 元 高级 双床 房 最 普...\n",
       "3078                  总体 感觉 比较 好 环境 一点 吵 挨近 一条 高架路 服务态度 好\n",
       "Name: cutted_comment, dtype: object"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Train/test split\n",
    "x_train,x_test,y_train,y_test = train_test_split(x,y,\n",
    "                                                 test_size=0.2,\n",
    "                                                 random_state=0)\n",
    "\n",
    "# Keep only the segmented text\n",
    "x_train = x_train['cutted_comment']\n",
    "x_test = x_test['cutted_comment']\n",
    "\n",
    "# Vectorize the text: the vocabulary is fitted on the training set only\n",
    "train_x = vectorizer.fit_transform(x_train)\n",
    "test_x = vectorizer.transform(x_test)\n",
    "\n",
    "# Training set\n",
    "print(\"训练集中的正负样本：\",Counter(y_train))\n",
    "print(\"训练集样本个数:%d,特征个数:%d\" % train_x.shape)\n",
    "\n",
    "# Test set\n",
    "print(\"测试集中的正负样本：\",Counter(y_test))\n",
    "print(\"测试集样本个数:%d,特征个数:%d\\n\" % test_x.shape)\n",
    "\n",
    "# Note: get_feature_names() was removed in scikit-learn 1.2;\n",
    "# newer versions provide get_feature_names_out() instead\n",
    "feature_names = np.asarray(vectorizer.get_feature_names())\n",
    "print(\"特征筛选前特征个数:{}\\n词汇:{}\".format(len(feature_names),feature_names))\n",
    "# Training text\n",
    "x_train.head()"
   ]
  },
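  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Feature selection\n",
    "\n",
    "The previous step shows that 8,347 feature words remain after filtering by stop words, `min_df`, and `max_df`. To reduce dimensionality, this step applies chi-square feature selection and keeps the 1,000 words most strongly associated with `y`."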
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "特征筛选后特征个数:1000\n",
      "\n",
      "1000个特征词汇为:\n",
      " ['一下', '一个', '一会', '一分', '一分货', '一口咬定', '一句', '一只', '一块', '一堆', '一塌糊涂', '一夜', '一天', '一座', '一张', '一无是处', '一早', '一晚', '一次', '一次性', '一流', '一直', '一看', '一碰', '一米', '一股', '一间', '万豪', '三个', '三思', '三星', '三星级', '上当', '上面', '下午', '下咽', '下来', '下楼', '下次', '下水管', '下水道', '不住', '不信', '不值', '不停', '不到', '不去', '不可思议', '不好', '不想', '不敢', '不敢恭维', '不来', '不热', '不爽', '不理', '不知', '不符', '不肯', '不行', '不见', '不象话', '不负责任', '不足', '不远', '不错', '东西', '丢人', '两个', '两张', '两样', '严重', '中午', '中央空调', '中年男人', '丰富', '丰盛', '临时', '为此', '久久', '义务', '之下', '之后', '之差', '之星', '九洲', '乱收费', '争执', '事件', '事先', '事情', '二十分钟', '二星', '交涉', '交警', '交通', '产生', '亲切', '亲戚', '人员', '人理', '今后', '从没', '付钱', '代理', '令人', '令人气愤', '以上', '以为', '以后', '以来', '价格合理', '企业', '休闲', '会员', '会员卡', '传真', '估计', '位置', '低下', '低档', '低级', '住天', '住店', '依然', '便利', '保安', '保证', '信任', '信号', '信用卡', '修好', '修理', '值得', '值班', '偏僻', '做事', '做出', '停电', '催促', '傻子', '先到', '免费', '入主', '入住', '入睡', '全价', '全国', '全是', '全部', '八点', '公园', '公安部门', '六点', '关不上', '关不紧', '关了', '其他人', '具体', '典型', '再也', '再也不会', '再住', '再换', '再次', '写字台', '写明', '冰冷', '冲凉', '决不会', '冷冰冰', '冷水', '冷淡', '冷热水', '冻死', '冻结', '凉水', '凌晨', '凑合', '几个', '几十块', '凯悦', '出奇', '出水', '出现', '出面', '分数', '分钟', '划算', '刚刚', '删除', '到位', '到极点', '到达', '刷卡', '刺鼻', '剃须刀', '前台', '剥落', '剩下', '办公楼', '办法', '办理', '加收', '劣质', '动手', '勉强', '千万别', '升级', '半个', '半夜', '半天', '半小时', '半岛', '协议', '单据', '卡车', '卫生', '卫生条件', '卫生间', '厕所', '厕纸', '原本', '原来', '反感', '反映', '反正', '反锁', '发现', '发生', '发票', '发脾气', '发誓', '发霉', '发黄', '发黑', '取消', '受不了', '受气', '变成', '口味', '只好', '只能', '只见', '可怕', '可怜', '可恶', '可想而知', '可气', '可笑', '号称', '合作', '合作伙伴', '合理', '吉林省', '同事', '同意', '同样', '后悔', '后来者', '后面', '吓人', '君悦', '听到', '吵得', '吵架', '告知', '告诉', '周到', '周末', '呵呵', '咸菜', '品种', '商场', '喀什', '喇叭', '喜来登', '喜欢', '噩梦', '噪音', '四星', '四星级', '四点', '回复', '回来', '回答', '图片', '地上', '地下室', '地址', '地方', '地板', '地步', '地毯', '地理位置', '坑人', '坚决', '坚持', '垃圾', '堵塞', '塑料', '墙体', '墙壁', '墙纸', '声音', '处理', '备用', '复杂', '外机', '外面', '多年', '多收', '夜里', '大声', '大失所望', '大跌眼镜', '太乱', '太低', '太偏', '太吵', 
'太差', '太旧', '太烂', '太高', '失望', '头发', '奇差', '奇慢', '奉劝', '好不容易', '好像', '好点', '如家', '字差', '安装', '安静', '完全', '定单', '定房', '宜家', '实在', '实惠', '实际', '客人', '客户', '客房', '客房部', '客服', '客观', '家具', '宽敞', '宾馆', '密封', '对不起', '对方', '对此', '导航', '尊重', '小到', '小姐', '小得', '小时', '少得', '尤其', '就让', '屋子里', '屋里', '岂有此理', '崩溃', '工地', '差劲', '差多', '差太差', '差差', '差得', '差远了', '已经', '市中心', '市场经济', '师傅', '帐单', '帐户', '席梦思', '干净', '干吗', '平安', '平方米', '平遥', '年代', '幽静', '广告业务', '广场', '床上', '床单', '店大欺客', '度假', '延时', '延迟', '建议', '开发票', '开房', '开裂', '开门', '异味', '弄脏', '弄脏了', '弥漫着', '弹簧', '强烈', '强烈建议', '强烈要求', '当天', '当时', '形容', '很大', '很小', '很差', '很快', '很棒', '很浓', '很漂亮', '很烂', '很脏', '很近', '很远', '得到', '心寒', '心情', '忍受', '快捷酒店', '怀疑', '态度', '态度恶劣', '态度生硬', '怎么回事', '性价比', '怪味', '总体', '总台', '总算', '恐怖', '恶劣', '恶心', '恼火', '情况', '情愿', '惊喜', '惬意', '想想', '想着', '愉快', '意思', '意识', '感冒', '感动', '感觉', '感觉不好', '慎重', '慎重考虑', '懒得', '我用', '我能', '我花', '我要', '房价', '房到', '房卡', '房客', '房租', '房费', '房钱', '房门', '房间', '所有', '所谓', '手机', '手段', '手续', '手续费', '才行', '扑鼻', '打不开', '打个', '打印', '打开', '打扫', '打电话', '打算', '打车', '批评', '找钱', '承担', '承认', '承诺', '投诉', '折腾', '报警', '抱歉', '押金', '抽水马桶', '拒绝', '拖拉机', '招待所', '拨打', '持续', '损坏', '换房', '授权', '掉下来', '排气扇', '接到', '接线', '接线员', '描述', '提出', '插座', '搞笑', '搞错', '搬家', '搬走', '携城', '携程', '携程网', '支付', '收到', '收拾', '收费', '收银', '改进', '放在', '教育', '散步', '敲门', '整个', '整体', '整晚', '整洁', '斑驳', '方便', '方巾', '施工', '旅店', '旅游局', '旅馆', '旋转', '无奈', '无法', '无能为力', '无表情', '无语', '早上', '早就', '早晨', '早餐', '早餐券', '时说', '时间', '明明', '明显', '明白', '昏暗', '星座', '星级', '昨晚', '是不是', '是否', '晓得', '晕倒', '晕死', '晚上', '景色', '暖气', '曲阜', '更别', '更换', '更让人', '最可气', '最后', '最多', '最好', '最小', '最差', '最烂', '最让人', '有人', '有房', '有无', '有没有', '有点', '服务', '服务员', '服务周到', '服务质量', '本店', '本来', '机器', '杜绝', '来说', '杭州', '杯子', '极低', '极小', '极差', '查房', '标准间', '标榜', '样子', '核对', '根本', '根本就是', '桌子', '椅子', '欺诈', '欺骗', '歉意', '正常', '正规', '正面', '此事', '此种', '步行', '死活', '段时间', '每间', '比例', '比较', '比较满意', '毛巾', '气味', '气愤', '气死', '气派', '水土不服', '水平', '水果', '水管', '水箱', '水龙头', '永远', '汉庭', '汕尾', 
'污水', '污渍', '污迹', '沟通', '没人', '没人来', '没想到', '没敢', '没法', '没洗', '没用', '没睡', '没窗', '没见', '没门', '河南人', '油漆', '注册', '洗不掉', '洗个', '洗手', '洗洁精', '洗涤', '洗澡', '洗脸', '浴巾', '海景', '海鲜', '消毒', '消费', '消费者', '淘汰', '深表歉意', '混乱', '清楚', '温馨', '游泳池', '满意', '漂亮', '漏水', '潮湿', '灰尘', '点前', '点半', '点多', '点评', '烂尾楼', '烟味', '烧水', '热情', '热情周到', '热水', '照片', '燕赵', '牙刷', '物品', '物有所值', '物超所值', '特价', '特色', '状态', '狡辩', '狭小', '环境', '现在', '现金', '理由', '理直气壮', '理睬', '理论', '甚远', '生存', '生手', '生日', '生气', '用户', '用车', '用过', '甲醛', '电梯', '电源', '电灯', '电视', '电视信号', '电话', '男友', '男子', '男服务员', '略显', '登记', '白色', '皇家', '皮球', '监控', '盒子', '盖章', '直到', '直接', '相信', '相关', '相同', '相差', '看看', '看着', '真不知道', '真是', '真是太', '真的', '真诚', '眼睛', '睡不着', '睡个', '睡着', '睡觉', '知道', '矿泉水瓶', '破旧', '破洞', '确定', '确实', '确认', '福州', '离开', '积分', '稍后', '空调', '突然', '窗户', '竟是', '第一', '第一个', '第一印象', '第一次', '第一间', '第三', '第二', '第二天', '第二间', '第五', '第四', '第天', '等待', '答复', '答应', '签字', '签约', '简陋', '算了', '算是', '管理', '管理人员', '箱子', '精致', '糟糕', '素质', '索引', '细心', '细节', '终于', '经历', '经理', '结帐', '结算', '绝不会', '维修', '缺点', '网上', '网站', '网管', '网线', '美中不足', '老鼠', '考察', '聊天', '联系', '肮脏', '脏乱差', '脏兮兮', '脏得', '脾气', '自动', '自助餐', '自来水', '自称', '臭味', '致电', '舒服', '舒适', '花园', '英才', '莫泰', '蛮横', '蟑螂', '血迹', '行为', '行政', '街边', '补偿', '补充', '表示', '被单', '被子', '被罩', '裸露', '西湖', '要加', '要命', '要收', '要死', '要求', '见过', '规定', '解决', '解决问题', '解释', '订单', '认识', '讲话', '设备陈旧', '设置', '证明', '评分', '诚信', '诚意', '诚聘', '误导', '说明', '说水', '说话', '请问', '调换', '调整', '调查', '调试', '谈不上', '象是', '负责人', '责任', '质疑', '质问', '购物', '贴心', '贵阳', '费用', '赔偿', '赚钱', '赠送', '走人', '走廊', '赶紧', '起床', '超值', '超小', '超差', '足疗', '足足', '跟前', '跟着', '跟踪', '跳蚤', '身份', '身份证', '转到', '转身', '轰鸣', '较差', '过分', '过敏', '过来', '过道', '过问', '近点', '返还', '还会', '还要', '这家', '这种', '这话', '进行', '进门', '连个', '连锁', '迹象', '退房', '适中', '适合', '逃走', '通知', '遇到', '道歉', '遗憾', '遥控器', '遭遇', '那间', '郁闷', '鄙视', '酒店', '里面', '金牌', '针对', '钟才', '钟点房', '钥匙', '钻石', '银行', '锅炉房', '错误', '锦地', '锦江', '长期', '门卡', '门卫', '门锁', '问题', '闹中取静', '闻所未闻', '阳台', '阴暗', '陈旧', '随后', '隔壁', '隔音', '难以', 
'难以忍受', '难受', '难吃', '难闻', '震惊', '霉味', '非常重视', '面上', '鞍山', '顾客', '预付', '预定', '预订', '领教', '风扇', '风景', '风格', '飞机', '食堂', '餐具', '餐券', '饮水机', '饮用水', '馒头', '首选', '香港', '马桶', '骗人', '高价', '鸡蛋', '黄色', '黑乎乎', '黑店', '黑色', '黑黑的', '齐全']\n"
     ]
    }
   ],
   "source": [
    "# Feature selection\n",
    "ch2 = SelectKBest(chi2,k=1000)\n",
    "x_train = ch2.fit_transform(train_x,y_train)\n",
    "x_test = ch2.transform(test_x)\n",
    "feature_names = [feature_names[i] for i in \n",
    "                 ch2.get_support(indices=True)]\n",
    "\n",
    "# Inspect the selected feature words\n",
    "print(\"特征筛选后特征个数:{}\\n\\n1000个特征词汇为:\\n {}\".format(len(feature_names),feature_names))"
   ]
  },
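  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the chi-square score concrete, the sketch below computes it by hand for one term. It is an illustrative addition and rests on an assumption: that the score compares the term's observed total count in each class with the count expected if the term were spread across classes in proportion to class size, which is how I understand scikit-learn's `chi2` to work on count features. `chi2_score` is a hypothetical helper and the toy data are made up.\n",
    "\n",
    "```python\n",
    "def chi2_score(term_counts, labels):\n",
    "    \"\"\"Chi-square statistic of one term's per-document counts against class labels.\"\"\"\n",
    "    n_docs = len(labels)\n",
    "    total = sum(term_counts)\n",
    "    if total == 0:\n",
    "        # A term that never occurs carries no information (assumption: score it as 0)\n",
    "        return 0.0\n",
    "    score = 0.0\n",
    "    for cls in set(labels):\n",
    "        observed = sum(c for c, y in zip(term_counts, labels) if y == cls)\n",
    "        expected = total * labels.count(cls) / n_docs\n",
    "        score += (observed - expected) ** 2 / expected\n",
    "    return score\n",
    "\n",
    "# Made-up toy data: three positive and three negative documents\n",
    "labels = [1, 1, 1, -1, -1, -1]\n",
    "only_in_positive = [1, 1, 1, 0, 0, 0]  # term appears only in positive documents\n",
    "evenly_spread = [1, 1, 0, 1, 1, 0]     # term appears equally in both classes\n",
    "print(chi2_score(only_in_positive, labels), chi2_score(evenly_spread, labels))\n",
    "```\n",
    "\n",
    "A term concentrated in one class scores high and is kept; a term spread evenly across classes scores zero and is among the first to be discarded."
   ]
  },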
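  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Building different classifiers\n",
    "\n",
    "- Build several classifiers: linear least squares with L2 regularization (RidgeClassifier), K-nearest neighbors (KNeighborsClassifier), naive Bayes (MultinomialNB), random forest (RandomForestClassifier), kernel SVM (SVC: hinge loss), linear SVM (LinearSVC: squared hinge loss + L1 regularization), and linear SVM (LinearSVC: squared hinge loss + L2 regularization)\n",
    "- Find the best hyperparameters with grid search\n"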
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Benchmark one classifier\n",
    "def benchmark(clf,name):\n",
    "    print(\"分类器:\",clf)\n",
    "    \n",
    "    # Grid search with 5-fold cross-validation; the grid depends on the classifier.\n",
    "    # m is the number of parameter combinations, used to average the training time.\n",
    "    alpha_can = np.logspace(-2,1,10)\n",
    "    model = GridSearchCV(clf,param_grid={'alpha':\n",
    "                                         alpha_can}, cv=5)\n",
    "    m = alpha_can.size\n",
    "    \n",
    "    # Regularization strength alpha\n",
    "    if hasattr(clf,'alpha'):\n",
    "        model.set_params(param_grid={'alpha':alpha_can})\n",
    "        m = alpha_can.size\n",
    "        \n",
    "    # K-nearest neighbors: number of neighbors\n",
    "    if hasattr(clf,'n_neighbors'):\n",
    "        neighbors_can = np.arange(1,15)\n",
    "        model.set_params(param_grid={'n_neighbors':\n",
    "                                     neighbors_can})\n",
    "        m = neighbors_can.size\n",
    "        \n",
    "    # LinearSVC: penalty parameter C\n",
    "    if hasattr(clf,'C'):\n",
    "        C_can = np.logspace(1,3,3)\n",
    "        model.set_params(param_grid={'C':C_can})\n",
    "        m = C_can.size\n",
    "        \n",
    "    # Kernel SVM: C and gamma\n",
    "    if hasattr(clf,'C') and hasattr(clf,'gamma'):\n",
    "        C_can = np.logspace(1,3,3)\n",
    "        gamma_can = np.logspace(-3,0,3)\n",
    "        model.set_params(param_grid={'C':C_can,'gamma':gamma_can})\n",
    "        m = C_can.size*gamma_can.size\n",
    "    # Random forest: maximum tree depth\n",
    "    if hasattr(clf,'max_depth'):\n",
    "        max_depth_can = np.arange(4,10)\n",
    "        model.set_params(param_grid={'max_depth':max_depth_can})\n",
    "        m = max_depth_can.size\n",
    "        \n",
    "    # Train the model. Note that this fits on train_x, the full bag-of-words\n",
    "    # matrix; the chi-square-selected x_train/x_test are not used here.\n",
    "    t_start = time()\n",
    "    model.fit(train_x,y_train)\n",
    "    t_end = time()\n",
    "    t_train = (t_end-t_start)/(5*m)\n",
    "    print(\"5折交叉验证的训练时间为:%.3f/(5*%d)=%.3f秒\"\n",
    "         %((t_end-t_start),m,t_train))\n",
    "    print(\"最优参数为:\",model.best_params_)\n",
    "    \n",
    "    # Predict on the test set\n",
    "    t_start = time()\n",
    "    y_hat = model.predict(test_x)\n",
    "    t_end = time()\n",
    "    t_test = t_end -t_start\n",
    "    print(\"测试时间：%.3f秒\"%t_test)\n",
    "     \n",
    "    # Evaluate the model\n",
    "    # Training-set accuracy\n",
    "    train_acc = metrics.accuracy_score(y_train,\n",
    "                                       model.predict(\n",
    "                                           train_x))\n",
    "    # Test-set accuracy\n",
    "    test_acc = metrics.accuracy_score(y_test,y_hat)\n",
    "    print(\"训练集准确率:%.2f%%\"%(100*train_acc))\n",
    "    print(\"测试集准确率:%.2f%%\"%(100*test_acc))\n",
    "    \n",
    "    # Return the timings, the error rates (1 - accuracy), and the name\n",
    "    return t_train,t_test,1-train_acc,1-test_acc,name"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "分类器的比较:\n",
      "\n",
      "分类器: RidgeClassifier(alpha=1.0, class_weight=None, copy_X=True, fit_intercept=True,\n",
      "        max_iter=None, normalize=False, random_state=None, solver='auto',\n",
      "        tol=0.001)\n",
      "5折交叉验证的训练时间为:41.859/(5*10)=0.837秒\n",
      "最优参数为: {'alpha': 0.21544346900318834}\n",
      "测试时间：0.000秒\n",
      "训练集准确率:88.95%\n",
      "测试集准确率:86.62%\n",
      "\n",
      "\n",
      "分类器: KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',\n",
      "           metric_params=None, n_jobs=1, n_neighbors=5, p=2,\n",
      "           weights='uniform')\n",
      "5折交叉验证的训练时间为:95.239/(5*14)=1.361秒\n",
      "最优参数为: {'n_neighbors': 1}\n",
      "测试时间：0.389秒\n",
      "训练集准确率:99.97%\n",
      "测试集准确率:80.61%\n",
      "\n",
      "\n",
      "分类器: MultinomialNB(alpha=1.0, class_prior=None, fit_prior=True)\n",
      "5折交叉验证的训练时间为:0.325/(5*10)=0.007秒\n",
      "最优参数为: {'alpha': 0.21544346900318834}\n",
      "测试时间：0.000秒\n",
      "训练集准确率:91.67%\n",
      "测试集准确率:87.07%\n",
      "\n",
      "\n",
      "分类器: RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',\n",
      "            max_depth=None, max_features='auto', max_leaf_nodes=None,\n",
      "            min_impurity_decrease=0.0, min_impurity_split=None,\n",
      "            min_samples_leaf=1, min_samples_split=2,\n",
      "            min_weight_fraction_leaf=0.0, n_estimators=200, n_jobs=1,\n",
      "            oob_score=False, random_state=None, verbose=0,\n",
      "            warm_start=False)\n",
      "5折交叉验证的训练时间为:17.168/(5*6)=0.572秒\n",
      "最优参数为: {'max_depth': 9}\n",
      "测试时间：0.044秒\n",
      "训练集准确率:72.82%\n",
      "测试集准确率:72.27%\n",
      "\n",
      "\n",
      "分类器: SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,\n",
      "  decision_function_shape='ovr', degree=3, gamma='auto', kernel='rbf',\n",
      "  max_iter=-1, probability=False, random_state=None, shrinking=True,\n",
      "  tol=0.001, verbose=False)\n",
      "5折交叉验证的训练时间为:477.596/(5*9)=10.613秒\n",
      "最优参数为: {'C': 100.0, 'gamma': 0.001}\n",
      "测试时间：0.864秒\n",
      "训练集准确率:97.47%\n",
      "测试集准确率:89.49%\n",
      "\n",
      "\n",
      "分类器: LinearSVC(C=1.0, class_weight=None, dual=False, fit_intercept=True,\n",
      "     intercept_scaling=1, loss='squared_hinge', max_iter=1000,\n",
      "     multi_class='ovr', penalty='l1', random_state=None, tol=0.0001,\n",
      "     verbose=0)\n",
      "5折交叉验证的训练时间为:15.466/(5*10)=0.309秒\n",
      "最优参数为: {'C': 10.0}\n",
      "测试时间：0.001秒\n",
      "训练集准确率:99.87%\n",
      "测试集准确率:87.07%\n",
      "\n",
      "\n",
      "分类器: LinearSVC(C=1.0, class_weight=None, dual=False, fit_intercept=True,\n",
      "     intercept_scaling=1, loss='squared_hinge', max_iter=1000,\n",
      "     multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,\n",
      "     verbose=0)\n",
      "5折交叉验证的训练时间为:17.259/(5*10)=0.345秒\n",
      "最优参数为: {'C': 10.0}\n",
      "测试时间：0.001秒\n",
      "训练集准确率:99.85%\n",
      "测试集准确率:86.97%\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Compare the classifiers\n",
    "print(\"分类器的比较:\\n\")\n",
    "clfs = [\n",
    "    [RidgeClassifier(),'Ridge'],# linear classifier: least squares + L2 regularization\n",
    "    [KNeighborsClassifier(),'KNN'],# K-nearest neighbors\n",
    "    [MultinomialNB(),'MultinomialNB'],# naive Bayes\n",
    "    [RandomForestClassifier(n_estimators=200),'RandomForest'],# random forest\n",
    "    [SVC(),'SVM'],# kernel SVM via SVC(), hinge loss\n",
    "    [LinearSVC(loss='squared_hinge',penalty='l1', # linear SVM: squared hinge loss + L1 regularization\n",
    "               dual=False,tol=1e-4),'LinearSVC-l1'],\n",
    "    [LinearSVC(loss='squared_hinge',penalty='l2', # linear SVM: squared hinge loss + L2 regularization\n",
    "               dual=False,tol=1e-4),'LinearSVC-l2']\n",
    "]\n",
    "\n",
    "# Collect the benchmark results in a list\n",
    "result = []\n",
    "for clf,name in clfs:\n",
    "    # Run the benchmark\n",
    "    a = benchmark(clf,name)\n",
    "    result.append(a)\n",
    "    print('\\n')\n",
    "# Convert the results to an array\n",
    "result = np.array(result)"
   ]
  },
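  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The printed log above is easier to compare once it is ranked. The sketch below is an illustrative addition: the accuracy figures are copied by hand from the recorded output, and `rank_by_test_accuracy` is a hypothetical helper. It sorts the classifiers by test accuracy and reports the gap between training and test accuracy as a rough overfitting indicator.\n",
    "\n",
    "```python\n",
    "# (name, training accuracy %, test accuracy %), copied from the recorded output above\n",
    "recorded = [\n",
    "    ('Ridge', 88.95, 86.62),\n",
    "    ('KNN', 99.97, 80.61),\n",
    "    ('MultinomialNB', 91.67, 87.07),\n",
    "    ('RandomForest', 72.82, 72.27),\n",
    "    ('SVM', 97.47, 89.49),\n",
    "    ('LinearSVC-l1', 99.87, 87.07),\n",
    "    ('LinearSVC-l2', 99.85, 86.97),\n",
    "]\n",
    "\n",
    "def rank_by_test_accuracy(rows):\n",
    "    \"\"\"Sort by test accuracy (descending) and attach the train-test gap.\"\"\"\n",
    "    ranked = sorted(rows, key=lambda r: r[2], reverse=True)\n",
    "    return [(name, test, round(train - test, 2)) for name, train, test in ranked]\n",
    "\n",
    "for name, test, gap in rank_by_test_accuracy(recorded):\n",
    "    print(f'{name:<14} test={test:.2f}%  gap={gap:.2f}')\n",
    "```\n",
    "\n",
    "Read this way, SVC has the best test accuracy, while KNN and both LinearSVC variants fit the training set almost perfectly but lose roughly 13 to 19 points on the test set. The random forest, limited to a depth of at most 9, scores only slightly above the share of positive reviews in the test set (1384 of 1980, about 69.9%)."
   ]
  },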
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Model comparison"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAArYAAAHeCAYAAABuTEhGAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAIABJREFUeJzs3XlYVdX+P/D34TBPInoFZ3M29GspYWomCjmTYo7lLKKl6dXudSSHNMOhwfGaOaRmYY7kTSvNyKksEEmcZXBkFgwB4Qyf3x/82Jcjh0klaft+Pc95Hs/ea6+99gbkzTprr6UREQERERER0d+cxZNuABERERHR48BgS0RERESqwGBLRERERKrAYEtEREREqsBgS0RERESqwGBLVEY6na7C6s7KysLx48fLXD4sLAzr1q1T3huNRvz555/Iy8szW16v1yMrKwvZ2dlFzluSDz74AP/+979R3slT9Ho9PvnkE/zyyy/lOq4477zzDiZOnPhQx4oIQkNDYTQaAQB//vknxo8fj59//rnMdSQkJJi8P3z4MBYvXmyybcuWLcjIyHioNpZFbm4ucnJyYDAYynWcwWBATk5OBbXqf9RynwEgNjYWqampZvf99ttviI+PL3NdBw4cwNq1a4v87AH5Pyfz58/H119//bBNJaIHCRGVSqfTiY+Pj4wePVrS0tIkNjZWIiIi5I8//pBz587JhQsXTF7nz5+X6OhoOXPmjPz++++l1v/WW2+JlZWVpKSklKk9c+bMEQCyfv16ERG5cOGCACj1NX78eJNrev7552X8+PGi0+mKnEOv10vt2rWlevXqcu/evTLeqXwGg0G8vLykZcuWZusur169eom3t/dDHbt7924BIFu3bhUREaPRKM8995x07dq1TMffvXtXatSoIVOmTFG2LVq0SGrVqqW8//HHH0Wj0ci0adPM1pGYmCjh4eESHR1d5HvF3Cs9Pb1IHfPmzSvT19jcy8PDQ6nn+++/f+h6tm3bVqnv8/Tp00Wj0YhWqxWtVisWFhZStWpVERFZuHChAFD2abVaASBbtmwpUo+Tk5O88847Zs/h4OAgU6dOLdM1iYiMGjVKqlevLrm5uWb3jxgxQmxsbOTs2bNlrpOIimdZUYGZSE10Oh3atWuHDz/8ED/88AM6deqEffv2wcbGBhYWFsjMzISVlRXs7e2Rnp4OOzs7WFtbQ6/Xw9LSEnfu3AEA3Lt3Dzdv3ixSf7t27bB27VqsXbsWgwYNApDfA1bQ29aqVSvY2toq5RctWoSbN2/izTffRMOGDdGpUyfExsbC1tYWWq0WGo0GzZs3x/jx4zF16lQYDAbodDqTOiwtLTFv3jwMHz4cMTEx2LlzJ1xcXJT9O3fuREpKCpo0aYL333+/SM9ZSSwsLLBmzRp4eXlhzZo1mDJlSrnveWG2trbIzc19qGMXL16MZs2a4fXXXwcAaDQaLFmyBN27d8eWLVswcuTIEo93dnbG2rVrMWTIEOTl5WHNmjXQarWwsrICAFy+fBmDBw/Ga6+9hqVLl5qtY9++fXj77bdhY2MDEUFWVhacnZ2h0WhMymVlZcHa2hoXLlww+VoAwMCBA9G6dWvY2NjA2traZF9UVBT+9a9/YcGCBejQoYPJvoLvwQL29vYAgHfffRfPPvtsiddeICkpCf/85z9RtWrVYstUhvtsZ2eHrl274vDhwwDyP9kYNmyYsq9jx44mn4w0btxYuR+F2djYwM7Ozuw5rKysTH6OinPkyBG4urriwIED8Pb2xtWrV5Gbmws3NzecO3cOGRkZsLW1RY8ePZCamorLly8jPj4eOTk50Ol0yn0konJ60sma6O8kOjpavvrqqyLbO3bsKPPmzRMRkdq1a8vmzZvNHv/tt98KAHFycpIqVaqYvPD/e8UKb7OzsxOtVitXrlwpUldOTo4EBAQovbzp6ely//59ZX+1atVk4cKFyvtD
hw6JXq8vUk9kZKS4urqKj4+Psi03N1eaNGkikyZNklOnTomVlZUcOXLE7DVlZGRIXl6eGI3GIvumTp0qX3/9dZHtRqNRcnNzJSMjQ9mWl5cn2dnZYjAYipR/7bXXTNpXwGAwSE5OTrG9Ydu2bRMAsmfPniL7BgwYIM7OzmbvrUh+j/a9e/ckLy9PRERWr14tQ4cOFYPBIB988IHUr19fRES++OIL6dChg+Tk5CjtycnJMVunyP96TLOzs022Z2RkSNWqVWXu3LnFHlscHx8fad68udmv74NOnjwpAOSnn34qc/1XrlwRAPLDDz+Y3V9Z7vPcuXNNvk9++uknqV27toiILF26VDp27GhSvlGjRma/P2vUqKH8PD/oH//4R5F9Z8+elTfffNPkZ8DKykrs7OyUn3dHR0extraW4OBgGT58uLRu3VratGkjbdu2NXk999xz8txzz5k9NxGVjsGWqAwyMjKKfKR+7do1iYuLk7i4OPH09JQpU6ZIXFycuLu7y7Jly5R9WVlZyjFHjhwRAHLp0qUi5/Dw8JD+/fs/VPvu3LkjVatWldWrVyvbCgfbqKgoASBvv/222eMjIiJM2vSvf/1LXF1dJTU1VUREZsyYITVq1DAbTvCQH2sDEBsbG6WezZs3P3Q9hQN8gRs3boiLi4t06dLF7DWnpKRIvXr1pGHDhhIXF1dk/8GDBx+6PSNHjjR7ThGRkJAQqVatWpHtQUFBUr9+/SKBtzQ7duwQALJhw4Yylf/ll1+UYHv9+nU5c+aMnD9/3uyQiD/++ENiY2OVYHvo0KEi9VWm+zxr1iyToQZarVaqV68uIiLvv/9+kX1arVYZOlGYm5tbscH2wX0bN24UBwcHsba2LvLH37x586R27dpm/+gTyf8jp/C+b7/9tkx/nBBR8RhsiUqh1+vF29tbWrVqJceOHVO2Ozs7l+mX765du5RjCnrLzAXbNm3ayBtvvFFke25urvLLLjU1Vf744w8leNy6dUspN2jQIKlfv74SwAsH2zlz5ohWq5WrV68q11Q4cBdWMI7x888/N2lD+/btpVatWnLhwgWT8tu2bZP//ve/8sMPP0jnzp2lUaNGsnPnziKvfv36ia2trXz99dfy1VdfybZt22TTpk1KPXfv3pWYmBi5ceOGJCQkSEJCgkRFRYmlpaXY2Ngo4eHKlSvK/lu3bklcXJykpaWZtCkvL0+8vb2lSpUqEhsba/Y6RfLHJteqVUtq1Kgh33//vcm+9PR0iYyMlNjYWElOTpaUlBT59ddfxcXFRWrUqCF169aVlJQUGTlypOzfv19SUlIkKSlJ4uPjJTExsdhzLl26VNq2bWuy7caNG2Jvby979+4t9jhzkpKS5B//+IcAUI6Nj48v8ZjCwbYs43YnTpxYbLCtzPe5QEJCgvTr10+++OIL5f5Mmzat2F5+EdPwWtA7XPDpQsG+kydPyksvvSQA5NVXX5XLly8XqefZZ5+VGTNmyJYtW0z+6BQR2blzpwCQFStWiEj+JyoAZMyYMaVeExEVj8GWqBQGg0HWrl0rTk5OotFo5L333hMRkerVqytDDp577jlZtmyZiJgORQAg33zzjVJXQagw90uwXbt2xYaLgwcPiojIV199ZbK9cBA+fvy4AFA+Wi0Itnl5eVKzZk0ZNGiQUjYyMtKknjlz5ijtc3Fxka5du8ovv/xi0r7U1FRp2rSpVKlSRdauXWu2F2rVqlVib29v9oGx7t27S+fOnUu81w969913xc3NTfr06SNdunSRunXrmoRhc/R6vbz++uui0Whk69atxQ6TKHDt2jVp3bq1aDQaGTFiRLEP8EVGRkr9+vWlS5cu8t5770n9+vXlzz//FC8vL9FqtTJz5swSw1KBoUOHiouLi8l5/P39i3zNf/vtt1Kvs0uXLuLm5qYE28jISLG2ti7xHhUOtrm5uSXeH51OJ/fv3zcbbCvbfU5MTJRz587J1atXlU9L4uLi5MCBAwJAli9fLnFx
cXL06FFp2LCh7N27V2JiYuTChQuSkJBgUlfBPX3wlZmZKW5ublKtWjUBIJ07dzb5Y7ew8PBw0Wg0cvnyZXn99ddNHqI7ffq0ODo6iq+vr8nPyqJFiwSATJ8+vdj7SEQlY7AlKqP4+Hjp2rWr8nGqm5ubbN68WfR6vbi4uChPjD8YbA8cOKDUURAqCnpOC2vXrp3069evyOwKkZGRylPyOp1OCRCdO3eW0aNHK8cbjUbx9PRUZkooCLa//fabODo6yqlTp5Sy9+7dk/DwcKUnbcGCBfLbb7+JtbW1dO/eXWbNmiX/+Mc/TNqXmJgop0+flg4dOoibm5vZnsHY2FjRaDRFPpL9888/xd7e3uyQgeJkZGRI9erVZeHChcoY29mzZ0vz5s3NjsMtMHnyZAEgH3zwQYl/LBR+Xb9+XcaOHSu+vr5FQtP9+/dlyZIlYmtrKz179pSsrCxZuHCh1KxZU0TyvyZBQUGi1WqlTZs2ZnvjC2vatKlYW1vLuHHjRCT/61Y4iA0YMECaN29eYh1Go1FGjx4tdnZ28t133ynB1mg0KmGz4PvgQYWDrYhIdna2yfkLvwqGRZgLtpXtPgcHB4utra24uLhItWrVlJe1tbVyfmdnZ5N9Li4u4uDgIAsWLDCpy83NTQYOHCg7d+6UHTt2yNatW+Wzzz4TnU4nbm5u4unpKcePHy/x6+Pr6yve3t6i0+lk+PDh8sorr4hOp5OsrCxp0KCBPPvssyZjzAs+lRk5cqS0a9dO7t69W9K3ABEVg8GW6CG5u7vL5s2bZc+ePQJA+Yi+QYMGMnnyZImJiREAJh+9hoWFCQC5du1akfratWtX4tjMB3Xu3FmZvis7O1vu379vEvgKD0XIyckRo9Eo9+/fVx7SKVC/fn354IMPRERkxYoVkp2dLcHBwVKvXj2lzJUrV6RZs2by0ksvSWZmppw5c6bYdvn4+BQZUvHZZ5+JVquV69evl/n6xo0bJ66urnLnzh0l2CYkJIizs7N89NFHxR6XmpqqjDc9d+6c/PHHH/Lll18KAPniiy9M/nAYP368yTjfwmErKytLgoKCpF69emJrayvLli1T7u+cOXOKjJM9cOCAODs7i6OjY7HXmZKSIhYWFvL111+LtbW1/Pjjjyb779y5Iw4ODvLhhx8We316vV7Gjh0rWq1Wdu/eLQkJCSZDEfLy8qRnz56i0WjMjrt9MNgWfARu7vXzzz+LiPlgW5nvs0j+9/zy5cvFyclJNmzYIAEBAVK/fn3Ztm2bZGZmFnucSPnG2JqTkpKiTCf24GvGjBkSEREhGzduFC8vL+Xn8aOPPpJWrVpJREREkZ9RIio7BluicijcS1SrVi1ZuXKl1KlTx+Qp5qCgILGyslJ+kRUOA//9738FgNy+fVtu375t8su/devWRXpsL1y4UOzDRIWDbffu3cv8wM2DMzYUDrYFli1bpjyNvn37dnFxcZGaNWsqvVYl2b9/v2i1WomKihKR/KDVsGFDGTBgQMk3t5BDhw6JRqORTz/9VERMZ0X46KOPxNbWVk6cOFHm+goesHrwD4px48bJM888U+xx/fv3lxEjRhTpnf7mm2/E09NTYmJiTLafPXtWPv7442LrW7NmjTRp0kRERGbPni3169c3mbN21qxZ4uLiUmxvXUZGhvTs2VO0Wq0y/+qDwVYkv0fey8tLLCws5MsvvzSp48Fg+/PPPwsAuXHjhlKmoM7IyEgRMR9szXlS9zkrK0u+/vprWbduncyfP19effVVqVKlinTv3t1kSM0XX3whDRs2FCsrK/Hy8pLBgwfLpEmTZNasWSYPbT1qsBXJH08fGxsrCQkJMmTIEOnataukpKQoY9uPHj0qAOTo0aMikj9G3tXV1WRmEyIqP85jS1RGBoMBXbp0Qd++fbF27VpoNBrY2dnh3//+N/7v//5PKbdw4UIsXLgQAIrMU1own62TkxMmTpyIrVu3muyPiorCvn37TLadOnUKXl5eJbZt48aN
0Ol0sLGxUc7p4eGBcePG4Z///CeA/NXJdDpdiXORFnb37l107twZv//+O/75z39i1qxZcHJyKvW4Pn36wNvbGwEBATh69CiWL1+OGzdu4LvvvivTeaOiojBgwAB07NgRAQEBRfa//fbb+Oabb9CnTx/s378fHTt2LLXOiIgIuLi4oG7duibbk5OTUbNmTbPH3L59G0FBQbCxsUFOTg4uXryo7Dt9+jROnz6Nu3fvmmzXarXo0qULIiIi0LZtW5P6DAYDVq1ahdGjRwPIn0f2v//9L4YPH47Q0FBcvnwZH3/8MYKDg+Hs7FykPUeOHEFAQAASExOxe/du9O3bt9jrdXBwwP79+/Hiiy9ixIgRcHFxQc+ePc2WffB7tDCtVlvsPnOe1H22t7fHxo0bcevWLbRs2RLdunXDsmXL0LBhQ+X6RASDBw/G4MGDER0djR9++AEXL17ElStX0K1bt3Jfa2nat2+v/NvGxgZWVlaoXr26su2ll15CnTp1sG/fPrRr1w6HDh3CsGHDYGNj81jbQfS0YbAlKqNvvvkGt2/fVn45iQhSUlLg7+8PACa/eItz/fp12NjYwNHREZaWlkUmjC8sLCwMXbp0KXUy+Pj4eLz//vv47LPPTLZrNBo4OjrC3d29LJcHADh37hw8PDwA5C+J+swzz2D79u2oU6dOmesAgPXr1+OFF15A7969cfToUcydOxdNmjQp9bgzZ86gR48ecHZ2xqZNm5SwXpjBYMCWLVvg5+cHHx8ffPzxx5gwYUKJAe3gwYPo2LFjkTLJycmoXbu22WM2bNiAJUuWwNraukjoycjIgNFoxMsvv2zSPqPRiNzcXGg0Gty7d8/kmC1btuDWrVt48803AeQvOrFv3z688MILCAgIwO+//462bdvi7bffNtueLVu2QKPR4MSJE3j++eeLvdYCNWrUwP79+zFixAiTP7weVLBEb0xMjNLmguVki1uiuThP8j5/9913mDVrFoKDgxESEqKUdXV1RVpaGtasWWP23n7xxRd44403Sr22ffv2wc/Pr9RyhYkIbt26hbS0NMTGxmLcuHGIiIjAkiVL8MorryAgIACrVq1C8+bNkZ6ejgkTJpSrfiIy4wn3GBP9LRiNRmndurVUq1ZN/vzzTxHJH4qAMnz0X/jj29dff11atGghIiLjx48vMmF8YT/99JMAkHPnzpnd37lzZ+nVq5c0bNhQPDw8iix7++ACDcUpGIoQEREhzs7OcuzYMZOhCA9r9uzZAkDs7e2LfJRszsqVK8XGxkZcXV3lzJkz0qhRo2LvqY2NjSQmJkqHDh0EgPTs2bPYB8p+/PFHAWB2vtJGjRoVO7dvcWJiYsTKykoaNmwozz33XKnjNQuOqVKliixdurTIvoLFDVDKTAjZ2dlml9o1NxShsAfvy4NDEUqaR7ZgQYayDEWoDPd5/vz54u3tLZmZmZKZmSkHDhxQFmj49NNPpX379sq+zMxMadiwocl0fAUKDzc4ceKEdOzYUQDIxYsXix2K8PPPPytjh0+dOiXNmjUTGxsb5V66ubnJ1KlTZfPmzXLnzh0REUlOThY7OzuxsrKSPn36lOv+EJF5FhUdnInU4D//+Q+ioqIwbdo05eN4o9GIzZs3Q/LHqpt9FZQrcOTIEbRq1QpA2T/mNdcTKSJISEjAgQMH4ObmhqNHj8LBweGhr+/XX3+Fj48PateuXWrv7IM9kQ/KyMjApEmTsHjxYowbNw7NmzfH888/j1WrViE7O9vsMffv38fPP/8MV1dXHD16FK1bt8aPP/6IGzduICUlBb1790anTp2QkpKCGzdu4Pz583Bzc0NYWBgmT56M2bNnw8Ki6H9nmZmZmDhxIpo0aYIhQ4YU2Z+SkoJatWqVeD0PXtuQIUPg6emJyMhIaDQa9OnTB7du3Sr2mOTkZPTv3x8tWrTAtGnTlO05OTlYsWIFJk2ahNatW6NJkybw8fHB7NmzkZCQUKQeOzu7IsvsloW5+1JYjx49
iv3+feWVV8p0jspwn4H8a9VqtXB0dISjoyPs7OyUn7OC5a8L9jk6OkKj0RT5OTQajcjOzkZUVBReeeUVdOzYEXl5edi7dy+aNm1q9rxGoxH9+vWDr68vAOD5559HgwYNEBQUhJMnT2Lo0KFo3bo1PvroI4waNUoZDmRhYYHatWtDp9OhV69eZb4/RFSCJ5Gmif5Orly5Ig4ODlKlShWT6Xlq1KhRph7bgum+CnrGCiZqnzBhQpmO/+OPP4q0qeAhtC5duhS70IKrq2upPba5ubni5OQkAOTll19WepKWLVtmMitCgZycHGUO0AdFRUXJO++8I05OTlKnTh2Tp/Tfffddsba2lqpVq8rbb78t3333XZHlUO/fv1/s4gJ9+vSRl19+ucRredDdu3elS5cuYm1tLb/++muR/bGxscoT/KXR6XSya9cuadSokTz77LNy+/ZtEcmfGcDHx0eqVKkiy5cvV+5fgaSkJGnYsKG4u7vL9evXJScnRw4fPixvvfWWVKtWTapUqSLvvfeespzwnDlzlGWUu3fvLmvWrFE+ISjO9evXBTC/nK05D/bYlsX58+eL7bGtDPe5wIIFC0Sj0SirillYWCifPBSsbFd41TGY6ekuuD8A5KWXXiqyjHDDhg2lW7ducu7cOeUBz48//lgAKHNZP2jYsGHi6+urvM/NzZXNmzdLzZo1pUaNGtKtWzfRaDTSt29f2b9/Px8gI3oEDLZEpdDpdDJo0CCZMWOGyfZq1arJ4sWLzS5FWvACIKGhoSKSP3RAo9EoH8uPGzdO2rZtK1euXDH7+uKLLwSAhIeHm21XcHBwsaFWRMTe3r7I/JwPOnHihPILtfAv05UrV4qlpaUcOXJEuZazZ8/Km2++KQCUJ/JFRN555x1p0KCBAJAaNWrIwoULiwyLEMlfXWvy5MlSpUoVJWAMGTKkxEn9C3Tv3l06dOhQajmR/GEjX3zxhdSpU0dsbGyU+1/g22+/lWHDhknjxo3F2tq6yOT8Ba5duyZbt26Vt956S9zc3MTKykomT55c5Nr0er0sXrxYnJycxM7OTl577TWTabamTJkikZGRotfrpU2bNmJhYSGdO3eW1atXmx1acOPGDZkyZYo4OTmJj49PiXP2iohcvXpVgP8tzFGagum9yhJsU1JSxN/fXxo3biwATGaiqGz3WSR/CduC2TNE8ofzFAxFWLduXZGhP40aNZLdu3cXadPIkSPNbhcRWbt2rfI9XPCysLCQXr16FfvzOHjwYPH29haj0SiTJ09WVot75ZVX5Nq1a2I0GmXt2rVSvXp1ASC2tralLtBBROYx2BKVQU5OTpGVkpydnYtMnfUgALJz504REfnjjz9MFlQYMWJEmcbYnjx58qHarNVqZdasWaWWCw0NLTKFV1xcnDRr1szs2NbRo0ebhK0jR45Iv3795MsvvyxTT9P9+/clNDRUAgMDS13MoEDXrl3Fy8urTGUTExPFw8NDPDw8zM63m5CQIK6urtK9e3dlqiVzwsLCRKPRSPPmzWXevHly8+bNUs87e/ZsqV27tgQHB5stU9ZlYEXyp7BKSkoqtVxBb2rBAiGlKVjKtaw9tv/3f/8n1atXl2HDhpl8fSvjfZ4xY0aRYFu1alURyf9jzVyw3b59e4nnexz69++v/GH29ddfS+fOnYv0BIvkT9O2bt06k1UCiah8NCL/fyAgEf2lsrOzYTQa4ejo+KSbojoJCQmoWrVqqTNKlCY5ORk1atQo1zFGoxEGgwFWVlaPdO7KIi8vD9bW1mb3Vbb7nJ2dDb1eb3a6NHPS09Nhb2/PKbaIVITBloiIiIhUgbMiEBEREZEqMNgSERERkSow2BIRERGRKlT4kroWFhaws7Or6NMQERERUSWSk5NjskjRX6HCg62dnR2ysrIq+jREREREVIk8yoqYD4tDEYiIiIhIFRhsiYiIiEgVGGyJiIiISBUqfIytOXl5eYiJiUF2dvaTOD09wN7eHo0aNSp2dSEi
IiKiv4MnEmxjYmLg4uKCZs2awcKCncZPktFoRGJiIs6cOQN3d3fUq1fvSTeJiIiI6KE8kVSZnZ0NNzc3htpKwMLCAu7u7rCwsMDevXuRmJj4pJtERERE9FCeWLJkqK08LCwsoNFoAAA3b958wq0hIiIiejhMlw8hNzfX7PaMjAyICAAgOTn5kc4RExPzSMc/DAsLC+h0ur/8vERERESPwxMZY/ugmde2PNb6guuPLFO5NWvWYNSoUSVOIJyUlITly5dj2bJlAIC0tDQMGzYMBw8eLFJ27NixmDx5Mry8vDBkyBCEhobCycnJpMyFCxcwZ84c7NmzBwAwYMAAfP3115g6dSqmTp2KBg0aAABmz56N3r17w8vLCz169EDjxo0BAOnp6Rg1ahTefvttAIBOp4OVlRUAYMWKFXBzc8OQIUMAAHq9HpaWleJLTERERFThnurU07x5c/Tt2xehoaHYtGkTWrVqhby8PLz44ouYPn06FixYgM2bN6Ndu3Y4ffo0li1bhuTkZGRmZmLUqFHIy8vD/Pnz0bRpU2zfvh1HjhxBWloaDAYDUlNT4efnp5wrLCwMABAUFIRr165h7ty5yMzMxMmTJzFo0CBEREQgOjoaq1evRosWLbBx40asWLECHTp0wJAhQxAcHKzUc/HiRaXeDh06wM7ODhYWFrh+/Trq1auHdevWQUSQnZ2No0ePckljIiIieio81cHWx8cHzs7OsLGxgdFohMFgQGhoKJo2bQpLS0vodDpERkZi0qRJ6Nu3Lw4fPowBAwYgNDQUjo6OSj3h4eEIDg6Gh4cH+vTpg8zMTOzevRt9+vSBXq9Hv379AAAbN25E48aNsWXLFlhYWMDe3h5xcXHYtWsXAgICEBQUhAYNGsBgMMDR0RFz5szBpUuXEBISgvDwcAD/67Et0KNHD3h5eSE1NRUxMTFo2LAh6tati1u3biEmJoahloiIiJ4aT22wXb9+Pfbv3w9fX1+88MILyvaCh6gAICQkBGlpaejfvz/8/f3RqVMnJCcno2PHjggODkbPnj0BAC1btkRISAj8/Pxw/Phx5ObmIi0tDcePH4dOp0PdunXRvHlz9O7dG1evXkWrVq0QFBSE3bt3Iy4uDn369MG1a9cQEBAAPz9LdTCfAAAgAElEQVQ/ODs7Y9myZZg2bRoGDhyIU6dOQavVKu3Ky8tT/j179mzEx8dj4cKFyMrKQuPGjbFjxw4cOHAA8+fPV8rl5OQgICAA27dvr8C7SkRERPTkPLXBNjAwEM8++yx++OEHAPnB78HxqCNGjECPHj2wePFiTJo0Cbdv30bfvn3xyy+/wMrKShnDamtrCxsbG9StWxe+vr7IzMzE7du34evrC+B/QdTe3h7t2rXD2bNn4eHhgX79+iEuLk45n4ODA1q0aAEASv1XrlzB+fPn8dtvvyE9PR09evSATqfDyJEjodVqcfHiRezYsQNLlizBxo0bMX78eIwbNw4uLi7IyMjAmDFjAAB2dnYMtURERKRqT22wLVAw7VhsbCxee+01k30ajQbLly/HtGnTEB0djejoaERFRSE+Ph5OTk5wcnLC3r174ejoCDs7O/j7+6Nly5Y4duwY+vfvj5YtWwKA8nBXdnY24uLikJmZiZiYGMTFxeHo0aPo3LkzgPyhCocOHTJpw/Lly5Geno6bN29CRJCWlob79+/DaDRi4MCB+PTTT/H+++8jICAAd+/exfLly2FlZYVFixbhq6++Ql5eHlcUIyIioqfCUx9s7969iyNHjuDy5cto0qSJyb5Lly7h0qVL2Lp1K3x8fLBx40aEhobi3r17qFKlCqysrODo6Ijo6Gi89dZbsLCwwJo1a5CRkQGtVotNmzbhmWeegU6nw6pVq9CmTRv4+/sjKioKNjY2EBH89NNPuHDhAgCYDDcAgNDQUAQFBcHDwwO+vr6wsrJCSEgIXFxcAACbNm1CXFwcxo4dCwBISUlBZmYmateujSlTpkCn06F69eoICAj4
C+4kERER0ZNVKYJtWafnetxyc3OxefNmuLu7w8PDAwCUeVxFBHXr1sXixYtx6NAhfP/99/jss88wa9YsnDp1CrVr18bSpUvRunVrtGrVCtu2bUNISAjOnDkDb29vaDQaREdHIycnB0OHDkXz5s2Vcx48eBDVqlXDCy+8gJkzZyoPg/Xu3Vtp29WrV3H//n24u7tj8ODBmDx5Mho0aID+/ftj6dKl8PT0xJgxY5ShBgDwySefwN3dXZnuqzCOsSUiIiK1e6oXaEhNTUVwcDBiY2MxZ84cAPnhsF69ejAajTh+/DgOHjyI4cOH48UXX8TEiRORmJiIzz//HA0bNsTHH3+MCxcuIDY2FmPHjkWjRo3w5ZdfwtraGnl5eVi5ciXeeecdfPvtt7h8+TIAICEhAdOmTcOHH34InU6H4OBgeHt7w9vbG3/++afStlGjRmH16tUYPXo0Zs6ciUGDBsHLywuffvoplixZokwfVlhubi70er3Za+UYWyIiIlI7jRQslVVBHBwckJWVZbItIiICbdu2rcjTVpisrCxotVrY2to+6aY8VhERETh58iQ8PT3Rvn37J90cIiIi+pszlwErWqUYivB3UtIqZURE9BfY8hhWq4yPf/Q6AGDevMdTDxE9Fk/1UAQiIiIiUg8GWyIiIiJShac22GZmZha7LzY2Funp6Wb3JSUlAcifNaGwjIwMZVtycvIjtS0mJuaRjiciIiJ6Gj2VwfbevXvw9fVFWFgYBg4ciFGjRmHw4MGIjIwEkD8/7OnTpzF79myT5WsBoGfPnvj2228xZcoUk+1jx47F0aNHkZOTgyFDhpgNzhcuXED//v2V9wMGDIDRaMSUKVMQX2i81+zZs7F161ZcvHgRDRo0gK+vL3x9fdG2bVusWrVKKVcwNRkArFixAiEhIcr74mZHICIiIlKrSvHw2BLN461vRinzPDg6OuLAgQNIT0+HVqvF+++/j+3btyM1NRXdunVDhw4doNVq0axZM/znP//BxIkTleV2HR0d0bt3b8THx0On08HKygrbt2/HkSNHkJaWBoPBgNTUVPj5+SnnK5iaKygoCNeuXcPcuXORmZmJkydPYtCgQYiIiEB0dDRWr16NFi1aYOPGjVixYgU6dOiAIUOGIDg4WKnn4sWLSr0dOnSAnZ0dLCwscP36ddSrVw/r1q2DiCA7OxtHjx6FnZ3d4725RERERJVUpQi2f7Vff/0V6enp6NmzJwBgwoQJaNWqFaysrEyWnx05ciT+/PNPeHt7K9N7nT17Fr6+vjAYDOjVqxfS0tIQHBwMDw8P9OnTB5mZmdi9ezf69OkDvV6Pfv36AchfLrdx48bYsmULLCwsYG9vj7i4OOzatQsBAQEICgpCgwYNYDAY4OjoiDlz5uDSpUsICQlBeHg4ACA9PV1ZzAEAevToAS8vL6SmpiImJgYNGzZE3bp1cevWLcTExDDUEhER0VPlqQy2Hh4e6N+/Pxo3bgwgfylbZ2dnkzIxMTGYMWMGZs2ahZ9//hkWFhbQ6XR49dVXsW/fPiXo1qxZEyEhIfDz88Px48eRm5uLtLQ0HD9+HDqdDnXr1kXz5s3Ru3dvXL16Fa1atUJQUBB2796NuLg49OnTB9euXUNAQAD8/Pzg7OyMZcuWYdq0aRg4cCBOnTplstRu4aERs2fPRnx8PBYuXIisrCw0btwYO3bswIEDBzB//vyKv5FERERElchTGWydnJzwzTffwNLSEiICnU6n/LtAo0aN8Oabb0Kn02Hjxo2YO3cu2rRpgzp16uDmzZtKKLa1tYWNjQ3q1q0LX19fZGZm4vbt2/D19QXwvyBqb2+Pdu3a4ezZs/Dw8EC/fv0QFxennM/BwQEtWrQAAFhZWUGv1+PKlSs4f/48fvvtN6Snp6NHjx7Q6XQYOXIktFotLl68iB07dmDJkiXYuHEjxo8fj3HjxsHFxQUZGRkmy+0SERERqd1TGWwBIDQ0FJcvX4alpSUyMzPh5OSkhNzC
NBoNAgMDcfLkSSxfvhzbt29HZGSkEmyB/OVq/f390bJlSxw7dgz9+/dHy5YtAeSHVADIzs5GXFwcMjMzERMTg7i4OBw9ehSdO3cGkD9U4dChQybnXr58OdLT03Hz5k2ICNLS0nD//n0YjUYMHDgQn376Kd5//30EBATg7t27WL58OaysrLBo0SJ89dVXyMvLMxlaQURERKRmT2WwNRqNWLlyJXbt2oUzZ85gz549iIuLw7hx42A0GnH8+HGlbGJiInbs2KG8f/HFF7Ft2za8+uqrsLGxQXR0NN566y1YWFhgzZo1yMjIgFarxaZNm/DMM89Ap9Nh1apVaNOmDfz9/REVFQUbGxuICH766SdcuHABAEyGGwD5wTsoKAgeHh7w9fWFlZUVQkJC4OLiAiB/5oa4uDiMHTsWAJCSkoLMzEzUrl0bU6ZMgU6nQ/Xq1REQEFDRt5OIiIioUngqg+2GDRvQoUMHODk5Ye7cudi3bx82bNiAdevWwc/PDxs2bICPjw+ysrLw4YcfIjAwEJGRkRg6dCi2bduGqKgobNiwAc7Ozhg+fDi2bduGkJAQnDlzBt7e3tBoNIiOjkZOTg6GDh2K5s2bAwByc3Nx8OBBVKtWDS+88AJmzpypPAzWu3dvpX1Xr17F/fv34e7ujsGDB2Py5Mlo0KAB+vfvj6VLl8LT0xNjxowxGWrwySefwN3dHUOGDPlL7yURERFRZVEpgm1p03M9bqNGjUJ2djaSk5Mxb9481KlTB7Nnz0ZCQgKSkpLQpk0beHl5ISkpCYcPH0ZSUhIGDhyoDD+YOnUqhg4dihMnTiA2NhaBgYGYMGECpk+fjs8//xxZWVlYuXIlLl68iM8++wyurq547rnnkJCQgGnTpmHGjBnYtm0bgoOD8fnnnwMADAaDSfuGDx+O0aNHY8GCBWjfvj0A4NNPP8Xs2bMxceJEeHt7m1xTbm4u564lIiKip5pGHlxC6zFzcHBAVlaWybaIiAi0bdu2Ik9b4QrG5apFREQETp48CU9PTyVIExFVSlu2PHodhRbFeSTz5j2eeohUyFwGrGhP5cpjj4OaQi0RERGRGjDYEhEREZEqMNgSERERkSow2BIRERGRKjDYPoTc3NyHPjYtLQ1fffUVAECn06GCn90jIiIiemo81cF2zZo1pT6tl5SUhH//+9/K+7S0NPTr16/M51ixYgXWrVunvHd0dMTMmTPxxx9/YMSIEfDx8YGvry98fX1RtWpV3L17t/wXQkRERESVYx7bxzJ1S2EjR5apWPPmzdG3b1+EhoZi06ZNaNWqFfLy8vDiiy9i+vTpWLBgATZv3ox27drh9OnTWLZsGZKTk5GZmYlRo0YhLy8P8+fPR0JCAl5//XU0adIEFy9eRGJionIOS0tLZVldg8GAO3fu4KOPPoK7u7vSc1vA29ubS+ASERERPaQyBdukpCQMGDAAx44dg06nQ//+/XHnzh2MHTvWZPWrvxsfHx84OzvDxsYGRqMRBoMBoaGhaNq0KSwtLaHT6RAZGYlJkyahb9++OHz4MAYMGIDQ0FA4Ojoq9aSkpMDf3x+rV6/GCy+8gE2bNiE6OhqWlpaIioqChYUFbGxs4OfnhwEDBuDEiRPo3r27yaIM3333HQBAo9H85feBiIiISA1KHYqQnp6OkSNHKh/Zr1q1Cm3btsWJEyewa9cuZGZmVngjK8L69evh5+eHkydPwtLyf/m+cLAMCQlBWloa+vfvD39/f3Tq1Alnz55Fx44dcfDgQaWcVqvF3r174e3tjeTkZIwZMwZTp07F0qVL4e/vj8GDB2PYsGGwsbFRem/1ej0OHz6Mw4cPQ6/Xm7SBiIiIiMqv1GCr1WqxY8cOODs7AwDCwsIwaNAgAMDLL7+M8PDwim1hBQkMDMSMGTOQlpYGAMjJySkSLkeMGIFPPvkE1atXx6RJk/Dyyy9j27ZtGD16NKysrJQlbLVaLfz9/REWFoaaNWsiJycHfn5+JY6XvXjxojK2
NioqquIulIiIiOgpUWo3YUGgLZCVlYXatWsDAFxdXZGUlFTkmPXr12P9+vUAoIS/ysrCIj/bx8bG4rXXXjPZp9FosHz5ckybNg3R0dGIjo5GVFQU4uPj4eTkBCcnJ+zduxdGo1E5RkRgZ2eHiRMnlhj6W7RogcOHDwPIH1tLRERERI+m3J9/Ozo6IicnB1WqVMG9e/dMxpoWCAwMRGBgIID8dYIrs7t37+LIkSO4fPkymjRpYrLv0qVLuHTpErZu3QofHx9s3LgRoaGhuHfvHqpUqQIrKys4OjpCr9dj7969iI6Oxq1btwAA48aNAwBcuXLF7HnPnDkDX19fAEBUVFSl/wOAiIiIqLIr93Rfbdu2xfHjxwHkB7IGDRo87jb9ZXJzc7F582b8/vvv8PDwAJA/tyyQ3/Nat25dLF68GI6Ojvj+++8xevRotGjRAkajEbVr18bnn3+Os2fPwmAwKEMRCgJtgcLz1BqNRuV9amqqMsY2PT0dlpaWyMvLU3qQiYiIiKh8yt1jO3LkSPTq1QvHjh3D+fPn0a5du0dvRRmn53rcUlNTERwcjMjISMybNw8A8MknnyizJBw/fhxRUVEYPXo0fv/9d/j5+SExMRGff/459uzZg48//hgXLlxAnz590KxZMwBQ6gGAnTt3YuXKldiwYQMA4N69e8Uu7vDGG29Ar9dzui8iIiKih6SRh1j66vbt2zh+/Di6d++OKlWqlFjWwcGhyCIIERERaNu2bXlPWylkZWVBq9XC1ta21LI5OTnKVF+lyczMhJOT0+No4kOJiIjAyZMn4enpifbt2z+xdhARlepxzH0eH//odQBAoc4MIjJlLgM+aOzYsTh//jx69+6NoKCgIvvT09PxxhtvIDk5GW3btsWnn35aYn0P9bl3rVq1MGjQoFJDrRo5ODiUKdQCgJ2dXZlCLYAnGmqJiIiI/mp79uyBwWDAL7/8gtjYWLPPJW3btg1vvPEGwsPDkZmZWepsXBU+eaqrqyvCwsJMtjHEVU63b99GdHR0scMliIgqBTMPLZdb06aPXgcAPPD7jYj+R6/Xw9PTU3lfeHIBwHQK2W7duuH48eNFHuSvVq0aoqOjkZGRgRs3bqBu3bolnrPCg+2dO3eKTGcVERFR0aelh1CrVi20bNmSQxGIqHKrTEMRhg59PPUQqZClpWWJPawPTiF7+vTpImVeeuklfPvtt1i5ciVatGgBV1fXEs/JR/CJiIiI6C9XMIUskP+AfeF1AQosWLAA69atw9y5c9G8eXNs3ry5xDoZbP8/nU5nckP1ej2MRmOJSwbHxsYiPT29XOdJS0vDV199pZzzIZ7dIyIiIvrbK8sUsunp6crUqqdOnYJGoymxzgofilAmCxY83vrK8JTqsWPH8N5778HW1hanT5/Gu+++i2+//RYnT55Ep06doNfrMXPmTLzzzjtYsmQJ1qxZAwcHB+Tk5GDmzJl4/vnnsWnTJnTt2hVdu3Yt9jwrVqyAjY0NJkyYACD/r5OZM2fCw8MDH3zwAZKSkpS5ayMiIhAfH/9UPpRHRERET5d+/fqhU6dOuH37Ng4ePIiQkBAEBQVh0aJFSplZs2Zh9OjRuHbtGtq3b4+hpQz/qRzB9gno1KkTZsyYge+++w5jxoyBv78/JkyYgO7du2Pfvn1KuQMHDiA9PR1arRbvv/8+tm/fjtTUVHTr1g0dOnSAhYUFfv75Z7z++uto0qQJLl68iMTEROV4S0tLWFlZAQAMBgPu3LmDjz76CO7u7krPbQFvb2/OY0tERERPBWdnZ4SFheHQoUOYPn063N3d0bp1a5MyXl5eOHfuXJnrfGqDLQDY29vj1KlTWL58OX799VdMnjwZzz33HCZMmICTJ09i/fr1SE9PR8+ePQEAEyZMQKtWrWBlZWUSQC0tLeHv74/Vq1fjhRdewKZNmxAdHQ1LS0tERUUpc9n6+flhwIABOHHiBLp37w6DwaDU8d133wFAqV3sRERERGpRtWpV
ZWaEx+GpDbbbt2/H+vXrISLw9vZGjx490KtXLzg6OqJ9+/a4efMmPDw80L9/fzRu3BgAoNVq4ezsXKQurVaLvXv3Ijo6GsnJyRgzZowyJcW6detga2uLYcOG4f79+0rvrV6vx48//gggv6fW0vKp/VIQERERPRZP7cNjQ4cORVhYGFxcXODl5YVatWoBABITE1G9enUA+fPtfvPNN2jQoAFEBDqdDpaWlkUe+NJqtfD390dYWBhq1qyJnJwc+Pn54e7du8We/+LFi/D19YWvry+ioqIq7kKJiIiInhJPbTdhwQNbAPDBBx/g9OnTiI2NxY0bN1CvXj0lvIaGhuLy5cuwtLRUlr0tCLkFCs+mICKws7PDxIkTS5y7rUWLFjh8+DAAFJnnl4iIiIjK76kNtoVptVrY29ujfv36OHr0KGxsbNC+fXsYjUasXLkSu3btwpkzZ7Bnzx7ExcVh3LhxMBqNyhQVer1eGYpw69YtAMC4ceMAwOzycABw5swZ+Pr6Asif4kKv1/8FV0pERESkXpUj2JZheq6KICJKz6yHhwd+/PFH+Pj4YPjw4Zg0aRI2bNiADh06wMnJCXPnzsW+ffuwYcMGrFu3Dn5+ftiwYQN8fHxgMBiUh8cWPDB1WeFhC0ajUXmfmppapD15eXkmPclEREREVHaVI9g+AXl5eejQoQOGDh0Kg8GASZMmwWg0Yu3atcjOzsbgwYMxf/58DBo0CMnJyZg3bx7q1KmD2bNnIyEhAUlJSWjTpg28vLxgMBjQrFkzAMC8QiF9586dWLlyJTZs2AAgf1WN3Nxcs+154403oNfrOd0XERER0UPSSAUvfeXg4ICsrCyTbREREWjbtm1Fnrbcbt26paxXDADZ2dmwsbGBVqt96DpzcnKUqb5KUzB+90mJiIjAyZMn4enpifbt2z+xdhARlWrLlkevIz7+0esAntgnjkR/B+YyYEV7antsH1Q41AL5c9w+Kjs7uzKXfZKhloiIiEgNntiAzsIzCdCTxa8FERERqcETCbb29vZITExkoKoEjEYjEhMTodPpICJc+YyIiIj+tp7IUIRGjRrh0qVLuH37NoNUJaDT6RAfHw+dTgcXF5cn3RwiIiKih/JEgq21tTVatmyJ33//HSdPnoRWqy2ymhf9tYxGI1q1aoUmTZo86aYQERERPZQn9vCYRqOBl5cXWrZsWewUWPTXsbKygoODA3vQiYiI6G/ric+KYG9v/1hmICAiIiKipxuXuSIiIiIiVWCwJSIiIiJVYLAlIiIiIlVgsCUiIiIiVWCwJSIiIiJVYLAlIiIiIlVgsCUiIiIiVWCwJSIiIiJVYLAlIiIiIlVgsCUiIiIiVWCwJSIiIiJVYLAlIiIiIlVgsCUiIiIiVWCwJSIiIiJVYLAlIiIiIlVgsCUiIiIiVWCwJSIiIiJVYLAlIiIiIlVgsCUiIiIiVWCwJSIiIiJVYLAlIiIiIlVgsCUiIiIiVWCwJSIiIiJVYLAlIiIiIlVgsCUiIiIiVWCwJSIiIiJVYLAlIiIiIlVgsCUiIiIiVWCwJSIiIiJVYLAlIiIiIlVgsCUiIiIiVWCwJSIiIiJVYLAlIiIiIlVgsCUiIiIiVWCwJSIiIiJVYLAlIiIiIlVgsCUiIiIiVWCwJSIiIiJVYLAlIiIiIlVgsCUiIiIiVWCwJSIiIiJVYLAlIiIiIlVgsCUiIiIiVWCwJSIiIiJVYLAlIiIiIlVgsCUiIiIiVSh3sE1PT0evXr3g6emJ8ePHV0SbiIiIiIjKrdzBdtu2bXjjjTcQHh6OzMxMhIeHV0S7iIiIiIjKpdzBtlq1aoiOjkZGRgZu3LiBunXrVkS7iIiIiIjKpdzB9qWXXsK1a9ewcuVKtGjRAq6urkXKrF+/Hp6envD09IRer38sDSUiIiIiKolGRKQ8B4wZMwaffPIJnJ2d8dFHH8HR0RGBgYHFlndwcEBWVtYjN5SIiAgAsGXLo9cRH//odQDAvHmPpx4iFXoS
GfChHh47e/YsDAYDTp06BY1GUxHtIiIiIiIql3IH21mzZiEwMBBVqlTBnTt3MHTo0IpoFxERERFRuViW9wAvLy+cO3euItpCRERERPTQuEADEREREakCgy0RERERqQKDLRERERGpAoMtEREREakCgy0RERERqQKDLRERERGpAoMtEREREakCgy0RERERqQKDLRERERGpAoMtEREREakCgy0RERERqQKDLRERERGpAoMtEREREakCgy0RERERqQKDLRERERGpAoMtEREREakCgy0RERERqQKDLRERERE9EWPHjkX79u2xaNGiEsu99dZb2L9/f6n1MdgSERER0V9uz549MBgM+OWXXxAbG4srV66YLXfs2DEkJibCz8+v1DoZbImIiIjoLxcWFoZBgwYBALp164bjx48XKaPT6TBu3Dg0aNAAoaGhpdZp+dhb+QBXV1eEhYVV9GmIiOhp4ej46HU0bfrodQAAf78RFUuv18PT01N5HxgYiMDAQOV9VlYWateuDSA/L54+fbpIHVu3bsWzzz6L6dOnY9WqVbh+/TrefvvtYs9Z4cH2zp078Pb2rujTEBHR02LLlkevIz7+0esAgKFDH089RCpkaWmJ8PDwYvc7OjoiJycHAHDv3j0YjcYiZSIjIxEYGAh3d3cMGzYMc+bMKTHYcigCEREREf3l2rZtqww/iIqKQoMGDYqUady4MWJjYwEA4eHhqF+/fol1VniPLRERERHRg/r164dOnTrh9u3bOHjwIEJCQhAUFGQyQ8LYsWMxZswYhISEQKfTYdeuXSXWqRERqchGOzg4ICsrqyJPQURET5PKNBRh3rzHUw+RCpUlA6anp+PQoUN4+eWX4e7u/sjnZI8tERERET0RVatWVWZGeBw4xpaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVOGhg+1bb72F/fv3P862EBERERE9tIcKtseOHUNiYiL8/Pwed3uIiIiIiB5KuYOtTqfDuHHj0KBBA4SGhlZEm4iIiIiIyq3cwXbr1q149tlnMX36dPz2229YtWpVkTLr16+Hp6cnPD09odfrH0tDiYiIiIhKUu5gGxkZicDAQLi7u2PYsGH46aefipQJDAxEeHg4wsPDYWlp+VgaSkRERERUknIH28aNGyM2NhYAEB4ejvr16z/2RhERERERlVe5u1PHjh2LMWPGICQkBDqdDrt27aqIdhERERERlUu5g62TkxN27txZEW0hIiIiInpoXKCBiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiIVIHBloiIiIhUgcGWiIiIiFSB
wZaIiIiIVIHBloiIiIhUgcGWiIiIiFSBwZaIiIiInoixY8eiffv2WLRoUYnlkpKS8Pzzz5daH4MtES+R404AABb8SURBVBEREf3l9uzZA4PBgF9++QWxsbG4cuVKsWX/9a9/IScnp9Q6GWyJiIiI6C8XFhaGQYMGAQC6deuG48ePmy135MgRODg4wN3dvdQ6LR9rC81wdXVFWFhYRZ+GiIieFo6Oj15H06aPXgcA8PcbUbH0ej08PT2V94GBgQgMDFTeZ2VloXbt2gDy8+Lp06eL1JGXl4eFCxdi79696NevX6nnrPBge+fOHXh7e1f0aYiI6GmxZcuj1xEf///au/egKK/7j+MfByQTYaiaVJmxFUajptUOBmi4CGQbI5e2GrEh4jgqHZEEWky0jT8crRAjZdWm06l1aik42sRUYtW0nRa8JFkBhUaGop1oDI7amaphmIC0gC0V+f3huBVYLsvucjm+X//I7h7Onue7z1k/++zheVzvQ5KWLXNPP4CBvL29VV1d3evjfn5+9uUFLS0tunv3bo82VqtVmZmZGj9+/ICek6UIAAAAGHKhoaH25Qfnzp1TUFBQjzYnT57U7t27ZbFYVFtbq7S0tD779PgRWwAAAKC7xYsXKyYmRjdu3FBJSYkOHjyozZs3dzlDQllZmf1ni8WiwsLCPvsc09nZ2emxEUvy9fVVa2urJ58CAPAwGUlLEXJy3NMPYKCBZMCmpiadOHFCsbGxA/rjsP5wxBYPp9dfd70P/kMDAMAlEyZMsJ8ZwR1YYwsAAAAjEGwBAABgBIItAAAAjECwBQAAgBEItgAAADACwRYAAABGINgCAADACARbAAAAGIFgCwAAACMQbAEAAGAEgi0AAACMQLAFAACAEQi2AAAAMALBFgAAAEYg2AIAAMAIBFsAAAAYgWALAAAAIxBsAQAAYASCLQAAAIxAsAUAAIARCLYAAAAwAsEWAAAARiDYAgAAwAgEWwAAABiBYAsAAAAjEGwBAABghEEH2/r6ej311FPuHAsAAAAwaIMOtj/84Q91+/Ztd44FAAAAGLRBBdsPPvhAvr6+CggIcPd4AAAAgEFxOti2t7frjTfekNVq7bVNQUGBwsLCFBYWpjt37rg0QAAAAGAgnA62VqtVmZmZGj9+fK9t0tPTVV1drerqanl7e7s0QAAAAGAgnA62J0+e1O7du2WxWFRbW6u0tDRPjAsAAABwitOHU8vKyuw/WywWFRYWunVAAAAAwGC4dB5bm83mpmEAAAAAruECDQAAADACwRYAAABGINgCAADACARbAAAAGIFgCwAAACMQbAEAAGAEgi0AAACMQLAFAACAEQi2AAAAMALBFgAAAEYg2AIAAMAIBFsAAAAYgWALAAAAIxBsAQAAYASCLQAAAIxAsAUAAIARCLYAAAAwAsEWAAAARiDYAgAAwAgEWwAAABiBYAsAAAAjEGwBAABgBIItAAAAjECwBQAAgBEItgAAADACwRYAAABGINgCAADACARbAAAAGIFgCwAAACMQbAEAAGAEgi0AAACMQLAFAACAEQi2AAAAMALBFgAAAEYg2AIAAMAIBFsAAAAYgWALAAAAIxBsAQAAYASCLQAAAIxAsAUAAIARCLYAAAAwAsEWAAAARiDYAgAAwAgEWwAAABiBYAsAAAAjEGwBAABgBIItAAAAjECwBQAAgBEItgAAADACwRYAAABGINgCAADACARbAAAAGIFgCwAAgGGxevVqRUZGatu2bQ4fb25uVmJiouLi4pSUlKT29vY++yPYAgAAYMgdOXJEHR0dqqys1JUrV1RXV9ejzYEDB7R+/XodP35cAQEBKi0t7bNPb08N9r6JEyfKZrN5+mkA58yc6Xof7NfA8PDzc70Pd7wHSLwPAH24c+eOwsLC7LfT09OVnp5uv22z2fTiiy9KkuLi4lRRUaEZM2Z06SMzM9P+c0NDgyZNmtTnc3o82DY2NspisXj6aQDnvP66630sW+Z6HwCc
t3+/631cu+Z6HxLvA0AfvL29VV1d3evjra2tmjJliqR7B0Jramp6bVtZWammpiZFRET0/ZyDGyoAAAAweH5+frp9+7YkqaWlRXfv3nXYrrGxUVlZWTp8+HC/fbLGFgAAAEMuNDRUFRUVkqRz584pKCioR5v29nYlJycrPz9fgYGB/fZJsAUAAMCQW7x4sd566y2tX79e7777rmbPnq3Nmzd3aVNUVKSamhrl5eXJYrGouLi4zz5ZigAAAIAh5+/vL5vNphMnTmjDhg0KCAhQcHBwlzYZGRnKyMgYcJ8EWwAAAAyLCRMm2M+M4A4sRQAAAIARCLYAAAAwAsEWAAAARiDYAgAAwAgEWwAAABiBYAsAAAAjEGwBAABgBIItAAAAjOD0BRqam5uVkpKijo4O+fr6qri4WD4+Pp4YGwAAADBgTh+xPXDggNavX6/jx48rICBApaWlnhgXAAAA4BSnj9hmZmbaf25oaNCkSZPcOiAAAABgMJwOtvdVVlaqqalJERERPR4rKChQQUGBJOnOnTuDHx0AAAAwQIMKto2NjcrKytLhw4cdPp6enq709HRJkq+v7+BHBwAAAAyQ02ts29vblZycrPz8fAUGBnpiTAAAAIDTnA62RUVFqqmpUV5eniwWi4qLiz0xLgAAAMApTi9FyMjIUEZGhifGAgAAAAwaF2gAAACAEQi2AAAAMALBFgAAAEYg2AIAAMAIBFsAAAAYgWALAAAAIxBsAQAAYASCLQAAAIxAsAUAAIARCLYAAAAwAsEWAAAARiDYAgAAwAgEWwAAABiBYAsAAAAjEGwBAABgBIItAAAAjECwBQAAgBEItgAAADACwRYAAABGINgCAADACN7DPQDAafv3D/cIAADACMQRWwAAABiBYAsAAAAjEGwBAABgBIItAAAAjECwBQAAgBEItgAAADACwRYAAABGINgCAADACARbAAAAGIFgCwAAACMQbAEAAGAEgi0AAACMQLAFAACAEQi2AAAAMALBFgAAAEYg2AIAAMAIBFsAAAAYgWALAAAAIxBsAQAAYATv4R6A8fbvd72PVatc7wMAAMBwHLEFAACAEQi2AAAAMALBFgAAAEYg2AIAAMAIBFsAAAAYgWALAAAAIxBsAQAAYASCLQAAAIxAsAUAAIARuPIYAHgSVx8EgCFDsAWAke71113vIyfH9T4AYIRjKQIAAACMQLAFAACAEQi2AAAAMALBFgAAAEYg2AIAAMAIBFsAAAAYgWALAAAAI3AeWwAAAE/hIi1DimALAAAIYDACwRaAcbL/7vp/0BOC3PMf9P/tc0s3AIABYI0tAAAAjECwBQAAgBFYioAhtX2M633w1S4AAHCEI7YAAAAwAsEWAAAARmApQi/c8ZW5xNfmGBpuWeLR6XofAIYHy7w8g7qOPkYGW7ec6keciw8A3I2gAMCTBhVsV69erQsXLuhb3/qWNm/e7O4xAQ8VPogBAB5WA8mUzuROp9fYHjlyRB0dHaqsrNSVK1dUV1fnbBcAAAB4yA0kUzqbO50+Ymuz2fTiiy9KkuLi4lRRUaEZM2Y42w0AYJTh2wX06/XXXe8jJ8f1PkxjaF0HkimdzZ1OB9vW1lZNmTJFkjRx4kTV1NT0aFNQUKCCggJJUltbm3x9fZ19muE3LtMt3Wztp5s7d+7I27uflyHTPWMZEca53kV/NZUGWFdX7djh2f6d4Yb9dWs/03RIajqSjKT3AHcYKfurO/bVkfIeII2gurrexYip60ipqTQkdTV5X21ra1NYWJj9dnp6utLT0+23B5IpB9LmQU5X0s/PT7dv35YktbS06O7duz3adB84ehcWFqbq6urhHoZxqKv7UVPPoK6eQV09g7q638Nc04FkyoG0eZDTa2xDQ0NVUVEhSTp37pyCgoKc7QIAAAAPuYFkSmdzp9NHbBcvXqyYmBjduHFDJSUlqqqqcrYLAAAAPOS6Z8qDBw9q8+bN2rZtW69t+sudXrm5ubnODOKRRx5R
SkqK2tvblZOTo8cff3xQG4P/CQ0NHe4hGIm6uh819Qzq6hnU1TOoq/s9rDXtnimfeOIJPfvss3226S93juns7OR6QwAAABj1nF5jCwAAAIxEBFs3S01N1VNPPaXIyEglJyfrv//9rz777DNZrdZe21+7dm1oB2mI1NRUVVRUqKWlRXPnztVjjz1mP81camqq9u3bJ4vF0uO+0So1NdV+Lr+UlBSlpqb22tZisXS5XVtbq9ra2h7tXn31VbeNr6/9vLsHx+foNbJYLIqMjFRsbKzWrl3rtjH2x9H8HUwfzszp+9tqsVhksVjsfyThbu+9955u3brlkb6HS2trq5KSkvTMM89oxYoVCg8Pt5+8/Q9/+IO++93vOjVvRpPc3Fy9/fbbPe5355zurnu9Ozs7HdZckn70ox8pKipKSUlJamlp6bPf7u9X9fX1iomJ8cg29MfUujY3NysxMVFxcXFKSkpSe3u7x7ZnuBFsPWDXrl2qrKyUn5+fTp48qYCAAGVnZw/3sIyVmZmprKwsfe1rX9PPf/7zHo87um+0OnfuXJd/B6q3YPuzn/3MLeOS5NJ+7ug1OnTokMrKyvTpp5/q4sWLrg5vwLrP36Fw6NAh2Ww22Ww2RUdHe+Q5TAy2b731liIjI3Xq1Ck98sgjevzxx3XixAlJ0vvvv6/4+HhJg583o5E753R33etdXV2thISEHjU/c+aMysvLdfr0acXFxdk/uA5EU1OTVq1apdbWVk9txqCM9roeOHBA69ev1/HjxxUQEKDS0lJPbc6wI9h6SGdnp1paWuTj46Nr1651OUpw9epVRUVF6bnnntOFCxck3fuEGhsbq6efflorVqxQYWGh6uvrlZiYqKioKOXn5w/Tloxs+/btU0tLi1avXi1J+uIXv6j333+/SxtH941WPj4++vzzzzV27FhJ967IIt2rQ29Hozdu3Cir1Sqr1ar58+d3eezBT/S5ubnatGmTYmNjNXfuXH322Wf6z3/+o2XLlumZZ57R8uXL1d7ertDQUCUmJur5559XeHi49uzZI0k99vMbN24oOjpaMTEx2rRpU5/b1dtr1NHRoebmZj366KP9VMa97s/fxsbGHtuQmpqqrVu3KiYmRlFRUbp9+7bDOd3Y2KiFCxcqJibGfrRn2rRpio6O1tKlSzV37lz9+c9/dvj8V69e1fz58xUVFaWdO3dKuvdar1u3TklJSfaxfPrpp/rGN76h8PBw/eY3v5EkffLJJ4qOjlZERITeeOMNSVJiYqJKS0uVkpKiH/zgB54r3BCbMmWKjh49qrq6OhUWFmrTpk32DyMffvihFixYIKnnvDFZf3O6ra1NL7zwgmJjY/W9731P0r1zgyYkJCgmJsZ+ZPB+X6+99pr9A0L3en/9619XfHx8j5ofO3ZM3/zmNzVmzBjFx8c7dXVSLy8vFRcXy9/f3w3VcJ/RXtfMzEz7fGhoaNCkSZNcLcmIRbD1gKysLAUFBWny5Mk9/rpPknbs2KENGzaotLRU//rXvyRJZ86cUXx8vI4ePapbt24pLS1N+fn5Wrp0qc6cOaP33ntPn3/++VBvyoj34Ycf6vLly/YTNq9bt67H0T9H941WwcHBKi4uVnBw8IB/Jz8/X9nZ2crOzu434F++fFllZWVasmSJPvjgA/3617/WnDlzdOrUKc2YMUN79+5VW1ubDh06pPPnz+udd97RX/7yF4d9Xb9+XVarVSUlJfrjH//Y5/M6eo2Sk5M1ffp0felLX1JgYOCAt9dVD87fmTNnOtyGlpYWlZeX68knn9Rf//pXh3P6xz/+sVJSUlReXq7m5maVlpaqs7NT+/fvV319vd58802dPXvWvq0PLsl47bXXtHXrVp0+fVrHjh2zH7F+9913tXPnTuXl5UmSNmzYoJycHJWXl2v79u3q7OzUn/70Jy1ZskRVVVX28z2WlJQoISFBBw8e1JtvvjlUpfS4hQsXat26dVqyZInWrl2r8PBw1dbW6h//+IfG
jRunxx57TNLg5o0pus/pgoICzZkzR2VlZbp586bOnz+vmzdvKisrSydPntS1a9dUX18vSaqqqlJkZKSOHTsmqWe9Ozo6HNa8vr5eEydOlHTvw9zChQv10ksv2ZfbWCwWbd261eF4/f399YUvfGFoiuOC0VbX+yorK9XU1KSIiAjPFmgYEWw9YNeuXcrIyND06dM1ZsyYHo9fvXpVwcHB8vb21ty5cyXd20l/97vfaenSpXrllVckSZcuXdIvf/lLWSwWtba26saNG0O6HaPB7t27FRwcrAMHDkiSQkJC9M9//lNXrlyxt3F032gVEhKiffv2KSQkpMv996/K4qqVK1dKkqZOnar29nZduHBB4eHhkqSIiAhdvHhRkydPlp+fnwIDA+Xl5aXeTqzi7e0tq9WqtLQ0e9jrjaPX6NChQ6qrq1NHR4fDNW+e8uD87W0bVq1aJel/dXI0px+sXXh4uC5evGivWVBQUJfa3V+KcP+KjRcvXlR4eLjGjBmjsLAwffLJJ5KkZcuW6YknnrCP49KlS8rJyVFcXJw6Ojp069YtrVixQufPn9eCBQvU3Nzs+YINo7q6OiUkJKi2tlYNDQ16++23FRoaqu3btysuLs7errd58zDoPqcvXbqko0ePymKx6MqVK7p+/brGjh2rwsJCLV++XI2Njfb3kzlz5mjJkiX2vhzV28vLq0fN/f397es/P/roI+3cuVO/+tWv7MttbDabtmzZMsSVcK/RWNfGxkZlZWVp7969nirLiECw9ZCXXnpJRUVF6ujo6PHY1KlT9fHHH6ujo0N/+9vfJEm///3vtXfvXlVUVOi5556TJM2aNUtWq1U2m03Z2dn2T2r4Hz8/P+Xm5iovL8/+hz5r165VeXl5l3aO7huNQkJCdPbsWYWEhOidd95RQ0ODJPW7XurRRx9VW1ubJPUaRCXJ19e3y+3Zs2fbT4ZdVVWl2bNnD3isP/3pT7Vx40YVFhY6/IDXnaPXaOzYsfL39+83GLvb/fnb2zZ0r5OjOe1K7b761a+qqqpKnZ2dOnv2rL7yla9Iure/P2jWrFnat2+fbDabvv/978vHx0c2m02bNm1SaWmpdu7caZ8XD+4DpigsLNTRo0fl5eWlOXPm6N///rcSEhK0Z88eJSQk2Ns9OG8eNt331VmzZunVV1+VzWbTtm3bNHXqVBUVFemFF17Qb3/72y7tu+9vjuotqUfN582bZ18feurUqSFfSjQURltd29vblZycrPz8/CH9Bmw4EGw9ZMKECXr22Wd1+PDhHo9t2LBB27Zt04IFC+Tj4yPp3smZv/Od72j+/PlauXKlrl+/ruzsbP3kJz/RvHnzVFpaqsmTJw/1ZowK06dPV2xsrC5fvixJev755zVt2rQubRzdNxoFBQVp5syZCgwM1KJFi7Rr1y69/PLL9q9ce7NgwQIdOXJE8+bNcyrgp6Wl6eOPP1ZsbKzq6uqc+ovyb3/723r55Ze1aNEijRs3TtevX++zfffXKDk5WREREbp586aWL18+4Od1h/vzd6Db4GhOb9y4UQcPHlR0dLTGjx/f5Qhif3bs2KEtW7YoKipKCQkJevLJJx22s1qtWrNmjSIiInT16lX5+vpq2rRpWrVqlaKiopSYmGhfV7py5UqtWbNGTz/9tNuO8A+3V155xX4WjY8++kgrVqxQfHy8/Pz87EfLpa7zxiRbtmxRWFiYwsLC9Itf/GJAv7NmzRqVlJQoNjZWe/bs0Ze//GUtWLBA+fn59qVzve3njuotqUfNFy1apGnTpikqKkrl5eVd1peOBibWtaioSDU1NcrLy5PFYlFxcfGAf3e04QINI0Rubq5Onz4tLy8veXt7a/v27U4d4QEAAHjYEWwBAABgBJYiAAAAwAgEWwAAABiBYAsAAAAjEGwBAABgBIItAAAAjECwBQAAgBH+H/JDtDEbBzmoAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 720x504 with 2 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# 数据 \n",
    "result = [[x[i] for x in result] for i in range(5)]\n",
    "training_time,test_time,traning_err,test_err,clf_names =  result\n",
    "\n",
    "training_time = np.array(training_time).astype(np.float)\n",
    "test_time = np.array(test_time).astype(np.float)\n",
    "traning_err = np.array(traning_err).astype(np.float)\n",
    "test_err = np.array(test_err).astype(np.float)\n",
    "\n",
    "# 可视化\n",
    "x = np.arange(len(training_time))\n",
    "plt.figure(figsize=(10,7),facecolor='w')\n",
    "ax = plt.axes()\n",
    "b0 = ax.bar(x+0.1,traning_err,width=0.2,color='#77E0A0')\n",
    "b1 = ax.bar(x+0.3,test_err,width=0.2,color='#8800FF')\n",
    "ax2 = ax.twinx()\n",
    "b2 = ax.bar(x+0.5,training_time,width=0.2,color='#FFA0A0')\n",
    "b3 = ax2.bar(x+0.7,test_time,width=0.2,color='#FF8080')\n",
    "plt.xticks(x+0.5,clf_names)\n",
    "plt.legend([b0[0],b1[0],b2[0],b3[0]],(\"训练集错误率:\",\n",
    "                                   \"测试集错误率\",\n",
    "                                  \"训练时间\",\n",
    "                                  \"预测时间\"),\n",
    "                                  loc = 'upper left',\n",
    "                                  shadow = True)\n",
    "\n",
    "plt.title(\"酒店评论文本分类及不同分类器比较\",fontsize=18)\n",
    "plt.xlabel(\"分类器名称\")\n",
    "plt.grid(True)\n",
    "plt.tight_layout(2)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 结果分析\n",
    "\n",
    "- 多种分类器中，SVM和KNN依然训练时间高于其他分类，Ridg线性分类器训练时间也相对较长。但是SVM虽然训练时间长，其分类准确率在训练集和测试集都很高，表现良好。KNN在文本分类中效率较为低下，耗时长且错误率高。其余的分类器表现差异不大。\n",
    "\n",
    "- 但在模型训练过程中，LinearSVC-l1报了收敛警告，LinearSVC-l2没有收敛警告，且表现不错。对比高斯核SVM(利用SVC类),线性核LinearSVC-l1(利用LinearSVC类，hinge损失的绝对值，惩罚项是l1)以及线性核LinearSVC-l2(利用 hinge损失的绝对值，惩罚项是l2)。 高斯核SVM虽然时间消耗大，但是表现良好。由于非线性核函数，该项目中是针对文本情感分类，利用CBOW模型进 行向量化，提取部分特征之后非线性变化。分析认为是文本信息和数值数据本身就不一样，文本数据的模型刻画中已 经进行了很多抽象化的操作，所以非线性变化较为合适。\n",
    "\n",
    "- LinearSVC类是基于liblinear,罚函数是对截矩进行惩罚，损失函数是基于hing损失的平方，可以用梯度下降优化。 SVC基于libsvm，罚函数不是对截矩进行惩罚，损失函数基于hing损失(非凸)，无法使用梯度下降求解。SVM解决问 题时，问题是分为线性可分和线性不可分问题的，liblinear对线性可分问题做了优化，故在大量数据上收敛速度比 libsvm快。这也合理的解释了上图中时间维度信息。 对于不同的惩罚函数，线性核LinearSVC-l1采用l1作为惩罚项，线性核LinearSVC-l2采用l2作为惩罚项，LinearSVC-l1 有收敛性警告，但LinearSVC-l2没有，分析原因主要是该项目中在向量化之前已经进行了特征筛选，剩余1000个特征 词汇，对于一条评论命中一个词汇概率已经较低。LinearSVC-l1利用l1作为惩罚项，本身就具备特征筛选的作用，对 于很多评论而言，这很可能导致参数为0，所以收敛性无法保证。\n",
    "\n",
    "通过三者对比，特征选择与否以及数据本身特征对于模型的选择和建立很重要。"
   ]
  }
 ],
 "metadata": {
  "hide_input": false,
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
   "autoclose": false,
   "autocomplete": true,
   "bibliofile": "biblio.bib",
   "cite_by": "apalike",
   "current_citInitial": 1,
   "eqLabelWithNumbers": true,
   "eqNumInitial": 1,
   "hotkeys": {
    "equation": "Ctrl-E",
    "itemize": "Ctrl-I"
   },
   "labels_anchors": false,
   "latex_user_defs": false,
   "report_style_numbering": false,
   "user_envs_cfg": false
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
